Nucleic and proteinic acids corresponding to human gene ABC1

ABSTRACT

The invention relates to nucleic acids corresponding to different exons and introns of the ABC1 gene, which is shown to be a causal gene for pathologies linked to a cholesterol metabolism dysfunction inducing diseases such as atherosclerosis, more particularly disruption of reverse cholesterol transport, and more particularly familial HDL deficiencies (FHDs) such as Tangier disease.

[0001] The present invention relates to nucleic acids corresponding to the various exons and introns of the ABC1 gene, which is now shown to be a causal gene for pathologies linked to a cholesterol metabolism dysfunction inducing diseases such as atherosclerosis, more particularly disruption in the reverse transport of cholesterol, and more particularly familial HDL deficiencies (FHD), such as Tangier disease. The invention also relates to means for the detection of polymorphisms in general, and of mutations in particular, in the ABC1 gene or in the corresponding protein produced by the allelic form of the ABC1 gene. The invention also provides pharmaceutical compositions comprising a nucleic acid containing the coding region of the ABC1 gene and pharmaceutical compositions containing the ABC1 protein intended for the treatment of diseases linked to a deficiency in the reverse transport of cholesterol, such as Tangier disease. The invention also provides methods for screening small molecules acting on the ABC1 protein, which may themselves constitute products acting on the reverse transport of cholesterol and as such may make it possible to effectively combat atherosclerosis from a therapeutic point of view.

[0002] High-density lipoproteins (HDL) are one of the four major classes of lipoproteins circulating in blood plasma.

[0003] These lipoproteins are involved in various metabolic pathways such as lipid transport, the formation of bile acids, steroidogenesis, cell proliferation and, in addition, interfere with the plasma proteinase systems.

[0004] HDLs are perfect free cholesterol acceptors and, in combination with the cholesterol ester transfer proteins (CETP), lipoprotein lipase (LPL), hepatic lipase (HL) and lecithin:cholesterol acyltransferase (LCAT), play a major role in the reverse transport of cholesterol, that is to say the transport of excess cholesterol in the peripheral cells to the liver for its elimination from the body in the form of bile acid.

[0005] It has been demonstrated that the HDLs play a central role in the transport of cholesterol from the peripheral tissues to the liver.

[0006] Various diseases linked to an HDL deficiency have been described, including Tangier disease, HDL deficiency and LCAT deficiency.

[0007] The deficiency involved in Tangier disease is linked to a cellular defect in the translocation of cellular cholesterol which causes a degradation of the HDLs. Nevertheless, for Tangier disease, the exact nature of the defect has not yet been precisely defined.

[0008] In Tangier disease, this cellular defect leads to a disruption in the lipoprotein metabolism. The HDL particles, not incorporating cholesterol from the peripheral cells and not being able to be metabolized correctly, are rapidly eliminated from the body. The plasma HDL concentration in these patients is therefore extremely reduced and the HDLs no longer ensure the return of cholesterol to the liver. This cholesterol accumulates in the peripheral cells and causes characteristic clinical manifestations such as the formation of orange-colored tonsils. Furthermore, other lipoprotein disruptions such as overproduction of triglycerides as well as increased synthesis and intracellular catabolism of phospholipids are observed.

[0009] Tangier disease, whose symptoms have been described above, is classified among the familial conditions linked to the metabolism of HDLs which are the ones most commonly detected in patients affected by coronary diseases.

[0010] Numerous studies have shown that a reduced level of HDL cholesterol is an excellent risk factor which makes it possible to detect a coronary condition.

[0011] In this context, syndromes linked to HDL deficiencies have been of increasing interest for the past decade because they make it possible to increase understanding of the role of HDLs in atherogenesis.

[0012] Several mutations in the apo A-I gene have been characterized. These mutations are rare and may lead to a lack of production of apo A-I.

[0013] Mutations in the genes encoding lipoprotein lipase (LPL) or its activator apo C-II are associated with severe hypertriglyceridemias and substantially reduced HDL-c levels.

[0014] Mutations in the gene encoding the enzyme lecithin:cholesterol acyltransferase (LCAT) are also associated with a severe HDL deficiency.

[0015] Furthermore, dysfunctions in the reverse transport of cholesterol may be induced by physiological deficiencies affecting one or more of the steps in the transport of stored cholesterol, from the intracellular vesicles to the membrane surface where it is accepted by the HDLs.

[0016] An increasing need therefore exists in the state of the art to identify genes involved in any of the steps in the metabolism of cholesterol and/or lipoproteins, and in particular genes associated with dysfunctions in the reverse transport of cholesterol from the peripheral cells to the liver.

[0017] Recently, a study was carried out of the segregation of different allelic forms of 343 microsatellite markers distributed over the entire genome and distant from each other by 10.3 cM on average.

[0018] The linkage study was carried out on a family which has been well characterized over eleven generations, in which many members are affected by Tangier disease, the family comprising five consanguineous lines.

[0019] This study made it possible to identify a region located in the 9q31 locus of human chromosome 9 which is statistically associated with the condition (Rust S. et al., Nature Genetics, vol. 20, September 1998, pages 96-98).

[0020] However, the study by RUST et al. only defines a wide region of the genome whose impairments are likely to be associated with Tangier disease. It is simply specified that the relevant 9q31-34 region contains ESTs but no known gene.

[0021] It has since been shown according to the invention that a region of about 1 cM situated in the 9q31 locus in humans was generally associated with familial HDL deficiencies.

[0022] More precisely, it has been shown that a gene encoding a protein of the family of ABC transporters, which is located precisely in the region of 1 cM of the 9q31 locus, was involved in pathologies linked to a deficiency in the reverse transport of cholesterol.

[0023] More particularly, it has been shown according to the invention that the gene encoding the ABC-1 transporter was mutated in patients impaired in the reverse transport of cholesterol, and most particularly in patients suffering from Tangier disease.

[0024] The ABC (“ATP-binding cassette”) transport proteins constitute a family of proteins which are extremely well conserved during evolution, from bacteria to humans.

[0025] The ABC transport proteins are involved in the membrane transport of various substrates, for example ions, amino acids, peptides, sugars, vitamins or steroid hormones.

[0026] The characterization of the complete amino acid sequence of some ABC transporters has made it possible to determine that these proteins had a common general structure, in particular two nucleotide binding folds (NBF) with Walker A and B type units as well as two transmembrane domains, each of the transmembrane domains consisting of six helices. The specificity of the ABC transporters for the various transported molecules appears to be determined by the structure of the transmembrane domains, whereas the energy necessary for the transport activity is provided by the degradation of ATP at the level of the NBF fold.

[0027] Several ABC transport proteins which have been identified in humans have been associated with various diseases.

[0028] For example, cystic fibrosis is caused by mutations in the CFTR (cystic fibrosis transmembrane conductance regulator) gene.

[0029] Moreover, some multiple drug resistance phenotypes in tumor cells have been associated with mutations in the gene encoding the MDR (multi-drug resistance) protein, which also has an ABC transporter structure.

[0030] Other ABC transporters have been associated with neuronal and tumor conditions (U.S. Pat. No. 5,858,719) or potentially involved in diseases caused by impairment of the homeostasis of metals, such as the ABC-3 protein.

[0031] Likewise, another ABC transporter, designated PFIC2, appears to be involved in a form of progressive familial intrahepatic cholestasis, this protein being potentially responsible, in humans, for the export of bile salts.

[0032] In 1994, a cDNA encoding a new mouse ABC transporter was identified and designated ABC1 (Luciani et al., 1994). This protein is characteristic of the ABC transporters in that it has a symmetrical structure comprising two transmembrane domains linked to a highly hydrophobic segment and two NBF units.

[0033] In humans, a partial cDNA comprising the entire open reading frame of the human ABC1 transporter has been identified (Langmann et al., 1999).

[0034] It has also been shown that the gene encoding the human ABC1 protein is expressed in various tissues, and more particularly at high levels in the placenta, the liver, the lungs, the adrenal glands as well as the fetal tissues.

[0035] These authors have also shown that the expression of the gene encoding the human ABC1 protein was induced during the differentiation of the monocytes into macrophages in vitro. Furthermore, the expression of the gene encoding the ABC1 protein is increased when the human macrophages are incubated in the presence of acetylated low-density lipoproteins (AcLDLs).

[0036] However, the exact role of the human ABC1 protein in the lipid transport system is completely unknown. It is simply assumed that the ABC1 protein has a translocase activity for phospholipids.

[0037] It has now been shown, according to the invention, that patients suffering from Tangier disease had a mutated ABC1 gene. Several mutations distributed in different exons of the ABC1 gene have been identified in the genome of various patients, in particular patients suffering from a severe form of the disease associated with coronary disorders. Moreover, various polymorphisms have been found both in the exons and in the introns of the ABC1 gene in patients suffering from the mildest forms of the disease, indicating that these patients carry particular alleles of the gene, distinct from the “wild-type” allele(s). Such alleles, partly characterizable by these polymorphisms, are moreover likely to contain substitutions, additions or deletions of nucleotides in the noncoding regions located respectively on the 5′ side of the first exon or alternatively on the 3′ side of the last exon of the gene, in particular in the regulatory regions, for example in the promoter sequences or alternatively in the enhancer sequences, of the type which induces defects—increase or decrease—in the synthesis of the ABC1 polypeptide.

[0038] A first particular mutation has thus been identified in a patient suffering from Tangier disease, in the ABC-1 gene, which is located in exon 13, and which consists of a substitution of a nucleotide causing the introduction of a codon for premature termination of translation into the open reading frame, leading to the synthesis of a truncated polypeptide comprising about a quarter of the amino acid sequence of the polypeptide synthesized in patients not affected by Tangier disease.

[0039] A second particular mutation in the ABC1 gene has been found, which consists of an insertion of a fragment of 110 nucleotides into exon 12, leading to the synthesis of a polypeptide which is abnormal in that it contains a deletion of 6 residues and an insertion of 38 amino acids, at position 468 of the sequence of the protein.

[0040] It has, in addition, been confirmed according to the invention that the ABC1 gene was positively regulated by the acetylated low-density lipoproteins (AcLDLs).

General Definitions

[0041] The term “isolated” for the purposes of the present invention designates a biological material (nucleic acid or protein) which has been removed from its original environment (the environment in which it is naturally present).

[0042] For example, a polynucleotide present in the natural state in a plant or an animal is not isolated. The same polynucleotide separated from the adjacent nucleic acids in which it is naturally inserted in the genome of the plant or animal is considered as being “isolated”.

[0043] Such a polynucleotide may be included in a vector and/or such a polynucleotide may be included in a composition and remains nevertheless in the isolated state because of the fact that the vector or the composition does not constitute its natural environment.

[0044] The term “purified” does not require the material to be present in a form exhibiting absolute purity, exclusive of the presence of other compounds. It is rather a relative definition.

[0045] A polynucleotide is in the “purified” state after purification of the starting material or of the natural material by at least one order of magnitude, preferably 2 or 3 and more preferably 4 or 5 orders of magnitude.

[0046] For the purposes of the present description, the expression “nucleotide sequence” may be used to designate either a polynucleotide or a nucleic acid. The expression “nucleotide sequence” covers the genetic material itself and is therefore not restricted to the information relating to its sequence.

[0047] The terms “nucleic acid”, “polynucleotide”, “oligonucleotide” or “nucleotide sequence” cover RNA, DNA or cDNA sequences or alternatively RNA/DNA hybrid sequences of more than one nucleotide, either in the single-chain form or in the duplex form.

[0048] The term “nucleotide” designates both the natural nucleotides (A, T, G, C) as well as the modified nucleotides which comprise at least one modification such as (1) an analog of a purine, (2) an analog of a pyrimidine, or (3) an analogous sugar, examples of such modified nucleotides being described, for example, in the PCT application No. WO 95/04064.

[0049] For the purposes of the present invention, a first polynucleotide is considered as being “complementary” to a second polynucleotide when each base of the first polynucleotide is paired with the complementary base of the second polynucleotide whose orientation is reversed. The complementary bases are A and T (or A and U), or C and G.
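By way of illustration only, the definition of complementarity given above may be sketched as follows. The function names are hypothetical and chosen for this example; the sketch assumes unmodified bases only and treats U as equivalent to T.

```python
# Base pairs as stated in the definition: A with T (or U), and C with G.
_PAIRS = {"A": "T", "T": "A", "C": "G", "G": "C"}


def _normalise(seq: str) -> str:
    """Upper-case the sequence and treat U (RNA) as T for pairing purposes."""
    return seq.upper().replace("U", "T")


def reverse_complement(seq: str) -> str:
    """Return the complementary strand, read in the reversed orientation."""
    return "".join(_PAIRS[base] for base in reversed(_normalise(seq)))


def is_complementary(first: str, second: str) -> bool:
    """True when each base of `first` pairs with `second` taken in reverse."""
    return reverse_complement(first) == _normalise(second)
```

For example, `is_complementary("ATGC", "GCAT")` holds, whereas `is_complementary("ATGC", "TACG")` does not, because the second strand must be read in the reversed orientation.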

[0050] “Variant” of a nucleic acid according to the invention will be understood to mean a nucleic acid which differs by one or more bases relative to the reference polynucleotide. A variant nucleic acid may be of natural origin, such as an allelic variant which exists naturally, or it may also be a nonnatural variant obtained, for example, by mutagenic techniques.

[0051] In general, the differences between the reference nucleic acid and the variant nucleic acid are small such that the nucleotide sequences of the reference nucleic acid and of the variant nucleic acid are very similar and, in many regions, identical. The nucleotide modifications present in a variant nucleic acid may be silent, which means that they do not alter the amino acid sequences encoded by said variant nucleic acid.

[0052] However, the changes in nucleotides in a variant nucleic acid may also result in substitutions, additions or deletions in the polypeptide encoded by the variant nucleic acid in relation to the peptides encoded by the reference nucleic acid. In addition, nucleotide modifications in the coding regions may produce conservative or nonconservative substitutions in the amino acid sequence.
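The distinction between silent modifications and modifications altering the encoded polypeptide may be illustrated by the minimal sketch below. The codon table is deliberately partial (a few entries of the standard genetic code), the codons are hypothetical examples not taken from the ABC1 sequences, and the function name is an assumption of this example.

```python
# Partial extract of the standard genetic code, sufficient for the example.
CODON_TABLE = {
    "GAA": "Glu", "GAG": "Glu",
    "GAT": "Asp", "GAC": "Asp",
    "CGA": "Arg",
    "TAA": "Stop", "TAG": "Stop", "TGA": "Stop",
}


def classify_codon_change(reference: str, variant: str) -> str:
    """Classify a single-codon change as silent, substitution or nonsense."""
    ref_residue = CODON_TABLE[reference]
    var_residue = CODON_TABLE[variant]
    if ref_residue == var_residue:
        return "silent"        # encoded amino acid unchanged
    if var_residue == "Stop":
        return "nonsense"      # premature termination codon introduced
    return "substitution"      # a different amino acid is encoded
```

With this table, GAA to GAG is silent (both encode Glu), GAA to GAT is a substitution (Glu to Asp), and CGA to TGA introduces a premature termination codon.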

[0053] Preferably, the variant nucleic acids according to the invention encode polypeptides which substantially conserve the same function or biological activity as the polypeptide of the reference nucleic acid or alternatively the capacity to be recognized by antibodies directed against the polypeptides encoded by the initial nucleic acid.

[0054] Some variant nucleic acids will thus encode mutated forms of the polypeptides whose systematic study will make it possible to deduce structure-activity relationships of the proteins in question. Knowledge of these variants in relation to the disease studied is essential since it makes it possible to understand the molecular cause of the pathology.

[0055] “Fragment” of a reference nucleic acid according to the invention will be understood to mean a nucleotide sequence of reduced length relative to the reference nucleic acid and comprising, over the common portion, a nucleotide sequence identical to the reference nucleic acid.

[0056] Such a nucleic acid “fragment” according to the invention may be, where appropriate, included in a larger polynucleotide of which it is a constituent.

[0057] Such fragments comprise, or alternatively consist of, oligonucleotides ranging in length from 8, 10, 12, 15, 18, 20 to 25, 30, 40, 50, 70, 80, 100, 200, 500, 1000 or 1500 consecutive nucleotides of a nucleic acid according to the invention.

[0058] “Variant” of a polypeptide according to the invention will be understood to mean mainly a polypeptide whose amino acid sequence contains one or more substitutions, additions or deletions of at least one amino acid residue, relative to the amino acid sequence of the reference polypeptide, it being understood that the amino acid substitutions may be either conservative or nonconservative.

[0059] “Fragment” of a polypeptide according to the invention will be understood to mean a polypeptide whose amino acid sequence is shorter than that of the reference polypeptide and which comprises, over the entire portion common with the reference polypeptide, an identical amino acid sequence.

[0060] Such fragments may, where appropriate, be included in a larger polypeptide of which they are a part.

[0061] Such fragments of a polypeptide according to the invention may have a length of 10, 15, 20, 30 to 40, 50, 100, 200 or 300 amino acids.

[0062] The “percentage identity” between two nucleotide or amino acid sequences, for the purposes of the present invention, may be determined by comparing two sequences aligned optimally, through a window for comparison.

[0063] The portion of the nucleotide or polypeptide sequence in the window for comparison may thus comprise additions or deletions (for example “gaps”) relative to the reference sequence (which does not comprise these additions or these deletions) so as to obtain an optimum alignment of the two sequences.

[0064] The percentage is calculated by determining the number of positions at which an identical nucleic base or an identical amino acid residue is observed for the two sequences (nucleic or peptide) compared, and then by dividing the number of positions at which there is identity between the two bases or amino acid residues by the total number of positions in the window for comparison, and then multiplying the result by 100 in order to obtain the percentage sequence identity.
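The calculation described above may be sketched as follows, by way of illustration only. The sketch assumes that the two sequences have already been optimally aligned, that gaps are written as "-", and that gap positions count toward the total number of positions in the window for comparison; the function name is hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percentage identity over a window of two pre-aligned sequences.

    Identical positions are counted, divided by the total number of
    positions in the window (gap positions included), and multiplied by 100.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    if not aligned_a:
        raise ValueError("the window for comparison must not be empty")
    identical = sum(
        1 for x, y in zip(aligned_a, aligned_b) if x == y and x != "-"
    )
    return 100.0 * identical / len(aligned_a)
```

For the hypothetical window `ATGCTA-GC` aligned against `ATGATACGC`, seven of the nine positions are identical, giving approximately 77.8% identity.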

[0065] The optimum sequence alignment for the comparison may be achieved using a computer with the aid of known algorithms contained in the WISCONSIN GENETICS SOFTWARE PACKAGE from the company GENETICS COMPUTER GROUP (GCG), 575 Science Drive, Madison, Wis.

[0066] By way of illustration, the percentage sequence identity may be determined with the aid of the BLAST software (versions BLAST 1.4.9 of March 1996, BLAST 2.0.4 of February 1998 and BLAST 2.0.6 of September 1998), using exclusively the default parameters (S. F. Altschul et al., J. Mol. Biol. 1990 215: 403-410; S. F. Altschul et al., Nucleic Acids Res. 1997 25: 3389-3402). BLAST searches for sequences similar/homologous to a reference “query” sequence, with the aid of the Altschul et al. algorithm. The query sequence and the databases used may be of the peptide or nucleic types, any combination being possible.

[0067] “High stringency hybridization conditions” for the purposes of the present invention will be understood to mean the following conditions:

[0068] 1—Membrane Competition and Prehybridization:

[0069] Mix: 40 μl salmon sperm DNA (10 mg/ml)+40 μl human placental DNA (10 mg/ml)

[0070] Denature for 5 min at 96° C., then immerse the mixture in ice.

[0071] Remove the 2×SSC and pour 4 ml of formamide mix in the hybridization tube containing the membranes.

[0072] Add the mixture of the two denatured DNAs.

[0073] Incubation at 42° C. for 5 to 6 hours, with rotation.

[0074] 2—Labeled Probe Competition:

[0075] Add to the labeled and purified probe 10 to 50 μl Cot I DNA, depending on the quantity of repeats.

[0076] Denature for 7 to 10 min at 95° C.

[0077] Incubate at 65° C. for 2 to 5 hours.

[0078] 3—Hybridization:

[0079] Remove the prehybridization mix.

[0080] Mix 40 μl salmon sperm DNA+40 μl human placental DNA; denature for 5 min at 96° C., then immerse in ice.

[0081] Add to the hybridization tube 4 ml of formamide mix, the mixture of the two DNAs and the denatured labeled probe/Cot I DNA.

[0082] Incubate 15 to 20 hours at 42° C., with rotation.

[0083] 4—Washes:

[0084] One wash at room temperature in 2×SSC, to rinse.

[0085] Twice 5 minutes at room temperature 2×SSC and 0.1% SDS at 65° C.

[0086] Twice 15 minutes at 65° C. 1×SSC and 0.1% SDS at 65° C.

[0087] Wrap the membranes in Saran and expose.

[0088] The hybridization conditions described above are adapted to hybridization, under high stringency conditions, of a molecule of nucleic acid of varying length from 20 nucleotides to several hundreds of nucleotides.

[0089] It goes without saying that the hybridization conditions described above may be adjusted as a function of the length of the nucleic acid whose hybridization is sought or of the type of labeling chosen, according to techniques known to persons skilled in the art.

[0090] Suitable hybridization conditions may for example be adjusted according to the teaching contained in the manual by HAMES and HIGGINS (1985) or in the manual by F. AUSUBEL et al (1999).

[0091] Nucleic Acids of the ABC1 Gene

Genomic Sequences

[0092] The human ABC1 gene is thought to comprise 48 exons and 47 introns, if reference is made in particular to the structure of the orthologous ABC1 gene in mice.

[0093] Several partial genomic nucleotide sequences of the ABC1 gene have been isolated and characterized according to the invention, these genomic sequences comprising both new exonic sequences and intronic sequences which may be used in particular for the production of various means of detection of the ABC1 gene or of its nucleotide expression products in a sample. These partial genomic sequences are represented in Table I below.

TABLE I
Partial genomic sequences of the human ABC1 gene

SEQ ID NO   Designation
1           Intron 10(p), exon 11(p)
2           Intron 11(p), exon 12, intron 12, exon 13, intron 13, exon 14, intron 14, exon 15, intron 15, exon 16, intron 16, exon 17(p)
3           Exon 17(p), intron 17(p)
4           Intron 18(p), exon 19, intron 19(p)
5           Intron 19(p), exon 20, intron 20, exon 21, intron 21, exon 22, intron 22, exon 23, intron 23, exon 24, intron 24, exon 25, intron 25, exon 26, intron 26(p)
6           Intron 26(p), exon 27, intron 27, exon 28, intron 28, exon 29, intron 29, exon 30, intron 30(p)
7           Intron 30(p), exon 31, intron 31, exon 32, intron 32, exon 33, intron 33, exon 34, intron 34(p)
8           Intron 34(p), exon 35, intron 35(p)
9           Intron 35(p), exon 36, intron 36(p)
10          Intron 36(p), exon 37, intron 37, exon 38, intron 38(p)
11          Intron 39(p), exon 40, intron 40, exon 41, intron 41(p)
12          Intron 41(p), exon 42, intron 42(p)
13          Intron 46(p), exon 47, intron 47(p)
14          Last exon(p), sequence in 3′ of the last exon

[0094] Thus, a first subject of the invention consists in a nucleic acid comprising at least 245 consecutive nucleotides of a polynucleotide chosen from the group consisting of the nucleotide sequences SEQ ID NO 1-14, or a nucleic acid having a complementary sequence.

[0095] The invention also relates to a nucleic acid having at least 80%, advantageously 90%, preferably 95% and most preferably 98% nucleotide identity with a nucleic acid comprising at least 245 consecutive nucleotides of a polynucleotide chosen from the group consisting of the nucleotide sequences SEQ ID NO 1-14, or a nucleic acid having a complementary sequence.

[0096] Thirty-two exons of the ABC1 gene have been characterized, at least partially, by their nucleotide sequence, as indicated in Table II below.

TABLE II

Exon No.             SEQ ID NO   Located in SEQ ID NO   Position of the nucleotide in 5′   Position of the nucleotide in 3′
11 (5′ end)          15          1                      3003                               3153
12                   16          2                      398                                603
13                   17          2                      1124                               1300
14                   18          2                      3087                               3309
15                   19          2                      5055                               5276
16                   20          2                      6337                               6541
17 (5′ end)          21          2                      7646                               7660
17 (3′ end)          22          3                      1                                  105
19                   23          4                      904                                1035
20                   24          5                      284                                426
21                   25          5                      630                                767
22                   26          5                      1470                               1690
23                   27          5                      2949                               3021
24                   28          5                      4008                               4210
25                   29          5                      5878                               5926
26                   30          5                      6122                               6235
27                   31          6                      561                                709
28                   32          6                      2359                               2483
29                   33          6                      3714                               3812
30                   34          6                      6848                               7036
31                   35          7                      1183                               1277
32                   36          7                      2587                               2619
33                   37          7                      3744                               3849
34                   38          7                      5323                               5397
35                   39          8                      236                                405
36                   40          9                      989                                1166
37                   41          10                     545                                660
38                   42          10                     772                                916
40                   43          11                     435                                564
41                   44          11                     829                                949
42                   45          12                     589                                651
47                   46          13                     377                                620
Last exon (3′ end)   47          14                     1                                  1237

[0097] Thus, the invention also relates to a nucleic acid comprising a polynucleotide chosen from the group consisting of the nucleotide sequences SEQ ID NO 15-47, or a nucleic acid having a complementary sequence.

[0098] Moreover, thirty-five introns of the ABC1 gene have been isolated and characterized, at least partially. The nucleotide sequences of the introns of the ABC1 gene, as well as their fragments and their variants may also be used as nucleotide probes or primers for detecting the presence of at least one copy of the ABC1 gene in a sample, or alternatively for amplifying a given target sequence in the ABC1 gene.

[0099] The references to the intronic sequences of the ABC1 gene are indicated in Table III below.

TABLE III

Intron No.                                SEQ ID NO   Located in SEQ ID NO   Position of the nucleotide in 5′   Position of the nucleotide in 3′
10 (3′ end)                               48          1                      1                                  3002
11 (3′ end)                               49          2                      1                                  397
12                                        50          2                      604                                1123
13                                        51          2                      1301                               3086
14                                        52          2                      3310                               5054
15                                        53          2                      5277                               6336
16                                        54          2                      6542                               7645
17 (3′ end)                               55          3                      106                                1285
18 (3′ end)                               56          4                      1                                  903
19 (5′ end)                               57          4                      1036                               1521
19 (3′ end)                               58          5                      1                                  283
20                                        59          5                      427                                629
21                                        60          5                      768                                1469
22                                        61          5                      1691                               2948
23                                        62          5                      3022                               4007
24                                        63          5                      4211                               5877
25                                        64          5                      5927                               6121
26 (5′ end)                               65          5                      6236                               6519
26 (3′ end)                               66          6                      1                                  560
27                                        67          6                      710                                2358
28                                        68          6                      2484                               3713
29                                        69          6                      3813                               6847
30 (5′ end)                               70          6                      7037                               7378
30 (3′ end)                               71          7                      1                                  1182
31                                        72          7                      1278                               2586
32                                        73          7                      2620                               3743
33                                        74          7                      3850                               5322
34 (5′ end)                               75          7                      5398                               5689
34 (3′ end)                               76          8                      1                                  235
35 (5′ end)                               77          8                      406                                645
35 (3′ end)                               78          9                      1                                  988
36 (5′ end)                               79          9                      1167                               1664
36 (3′ end)                               80          10                     1                                  544
37                                        81          10                     661                                771
38 (5′ end)                               82          10                     917                                1279
39 (3′ end)                               83          11                     1                                  434
40                                        84          11                     565                                828
41 (5′ end)                               85          11                     950                                1124
41 (3′ end)                               86          12                     1                                  588
42 (5′ end)                               87          12                     652                                729
46 (3′ end)                               88          13                     1                                  376
47 (5′ end)                               89          13                     621                                731
Distal sequence in 3′ of the last exon    90          14                     1238                               3501
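As an illustrative cross-check of the coordinates listed in Tables II and III, the sketch below takes the exon and intron positions given for SEQ ID NO 2, computes each feature length as (3′ position − 5′ position + 1), and verifies that consecutive features abut without gap or overlap. It assumes that the listed positions are 1-based and inclusive; the variable and function names are hypothetical.

```python
# (feature, 5' position, 3' position) within SEQ ID NO 2, from Tables II and III.
SEQ2_FEATURES = [
    ("intron 11 (3' end)", 1, 397),
    ("exon 12", 398, 603),
    ("intron 12", 604, 1123),
    ("exon 13", 1124, 1300),
    ("intron 13", 1301, 3086),
    ("exon 14", 3087, 3309),
    ("intron 14", 3310, 5054),
    ("exon 15", 5055, 5276),
    ("intron 15", 5277, 6336),
    ("exon 16", 6337, 6541),
    ("intron 16", 6542, 7645),
    ("exon 17 (5' end)", 7646, 7660),
]


def feature_length(start: int, end: int) -> int:
    """Length of a feature given 1-based inclusive positions."""
    return end - start + 1


def is_contiguous(features) -> bool:
    """True when each feature starts immediately after the previous one ends."""
    return all(
        nxt[1] == cur[2] + 1 for cur, nxt in zip(features, features[1:])
    )


lengths = {name: feature_length(start, end) for name, start, end in SEQ2_FEATURES}
```

Under these assumptions, the twelve features tile positions 1 to 7660 of SEQ ID NO 2 exactly, and exon 12, for example, is 206 nucleotides long.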

[0100] The invention also relates to a nucleic acid comprising at least 8 consecutive nucleotides of a polynucleotide chosen from the group consisting of the nucleotide sequences SEQ ID NO 48-89, or a nucleic acid having a complementary sequence.

[0101] The subject of the invention is, in addition, a nucleic acid having at least 80% nucleotide identity with a polynucleotide chosen from the group consisting of the nucleotide sequences SEQ ID NO 48-89, or a nucleic acid having a complementary sequence.

[0102] The invention also relates to a nucleic acid hybridizing, under high stringency conditions, with a polynucleotide chosen from the group consisting of the nucleotide sequences SEQ ID NO 48-89, or a nucleic acid having a complementary sequence.

[0103] In addition, a potentially regulatory genomic nucleotide sequence located downstream of the 3′ end of the last exon of the ABC1 gene has been isolated. It is a polynucleotide having the sequence SEQ ID NO 90. The characterization of polymorphisms in this potentially regulatory sequence (possible presence of regulatory sequences of the “enhancer” type), in particular in patients suffering from mild forms of deficiency in the reverse transport of cholesterol, in particular mild forms of Tangier disease, would be of the type allowing the production of appropriate means of detection, probes or primers, specific for some of these polymorphisms capable of inducing defects in the regulation of the expression of the ABC1 gene.

[0104] In order to identify the biologically active polynucleotide fragments of the sequence SEQ ID NO 90, persons skilled in the art can advantageously refer to the book by Sambrook et al. (1989) which describes the use of a recombinant vector carrying a marker gene (for example β-galactosidase, chloramphenicol acetyl transferase and the like) whose expression may be detected when this marker gene is placed under the control of a suitable promoter and of a biologically active fragment of the polynucleotide having the sequence SEQ ID NO 90. Such biologically active fragments of the sequence SEQ ID NO 90 may be in particular cloned into appropriate selection vectors having regulatory sequences, such as one of the vectors pSEAP-Basic, pSEAP-Enhancer, pβgal-Basic, pβgal-Enhancer, or pEGFP-1, marketed by the company Clontech.

[0105] The subject of the invention is, in addition, a nucleic acid having at least 80% nucleotide identity with a polynucleotide chosen from the group consisting of the nucleotide sequences SEQ ID NO 90, or a nucleic acid having a complementary sequence.

[0106] The invention also relates to a nucleic acid hybridizing, under high stringency conditions, with a polynucleotide chosen from the group consisting of the nucleotide sequences SEQ ID NO 90, or a nucleic acid having a complementary sequence.

[0107] Complete cDNA

[0108] As already indicated above, a partial cDNA sequence corresponding to the expression of the ABC1 gene has been identified by Langmann et al. (1999). This partial cDNA sequence of ABC1 comprises 6880 nucleotides and contains the entire open reading frame corresponding to the ABC1 protein produced in subjects not affected by disorders linked to the reverse transport of cholesterol. The cDNA sequence described by Langmann et al. (1999) contains, in addition, a portion of the 5′-UTR region (nucleotides 1 to 120) and a portion of the 3′-UTR region (nucleotides 6727 to 6880).

[0109] The complete cDNA corresponding to the ABC1 gene, which comprises a new 3′-UTR region constituting a major nucleic region, in particular from the point of view of the stability of the messenger RNAs in the cell, has now been isolated and characterized according to the invention.

[0110] The analyses of expression of the transcript having the sequence SEQ ID NO 91 were carried out by RT-PCR, as described in Example 1. These analyses carried out starting with the polyA+ RNA of various tissues have made it possible to ensure that the ABC1 gene was expressed in the fetal brain, the brain, the heart, the placenta and the uterus.

[0111] Consequently, the invention also relates to a nucleic acid comprising a polynucleotide having the sequence SEQ ID NO 91 of the cDNA of the human ABC1 gene, or a nucleic acid having a complementary sequence.

[0112] The cDNA of the human ABC1 gene having the sequence SEQ ID NO 91 comprises an open reading frame going from the nucleotide at position 121 (base A of the ATG codon for initiation of translation) to the nucleotide at position 6723 of the sequence SEQ ID NO 91. A polyadenylation signal (having the sequence ATTAAA) is present, starting at the nucleotide at position 9454 of the sequence SEQ ID NO 91.

[0113] The cDNA having the sequence SEQ ID NO 91 encodes the ABC1 polypeptide of 2201 amino acids in length, and having the amino acid sequence SEQ ID NO 139.
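By way of illustration only, the coordinates given above can be checked for internal consistency. The short Python sketch below uses only the positions quoted in the text (121, 6723 and 2201) and treats them as 1-based inclusive coordinates, which is an assumption of the sketch.

```python
# Coordinates quoted in the text for SEQ ID NO 91.
# Assumption: positions are 1-based and inclusive.
ORF_START = 121          # base A of the ATG initiation codon
ORF_END = 6723           # last nucleotide of the open reading frame as stated
PROTEIN_LENGTH = 2201    # amino acids in SEQ ID NO 139


def interval_length(start: int, end: int) -> int:
    """Length in nucleotides of a 1-based inclusive interval."""
    return end - start + 1


orf_nt = interval_length(ORF_START, ORF_END)
codons, remainder = divmod(orf_nt, 3)

print(f"ORF length: {orf_nt} nt = {codons} codons, remainder {remainder}")
print("Consistent with stated protein length:", codons == PROTEIN_LENGTH and remainder == 0)
```

The interval from 121 to 6723 spans exactly 2201 codons. This would be consistent with a stop codon at positions 6724 to 6726 lying just outside the stated interval, although that placement is an inference and is not stated in the text.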

[0114] The invention also relates to a nucleic acid comprising at least eight consecutive nucleotides of a polynucleotide having the sequence SEQ ID NO 92, a biologically active fragment thereof or a nucleic acid having a complementary sequence.

[0115] The subject of the invention is also a nucleic acid having at least 80% nucleotide identity with a polynucleotide having the sequence SEQ ID NO 92, a biologically active fragment thereof or a nucleic acid having a complementary sequence.
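As a hedged illustration of the identity criterion, the following Python sketch computes a percentage of nucleotide identity between two sequences that have already been aligned. The function name and the convention used (matches divided by aligned columns, ignoring columns that are gaps in both sequences) are assumptions of this sketch; the comparison method actually intended is the one defined elsewhere in the present description.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity between two pre-aligned sequences of equal length.

    Gaps are written as '-'. Columns that are gaps in both sequences are
    ignored; a gap opposite a base counts as a mismatch. This is one common
    convention among several, chosen here only for illustration.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == "-" and y == "-":
            continue
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * matches / compared


# Toy sequences, not taken from any SEQ ID NO of the description.
identity = percent_identity("ACGTACGTAC", "ACGTTCGTAC")
print(identity, identity >= 80.0)
```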

[0116] Another subject of the invention consists in a nucleic acid hybridizing, under high stringency conditions, with a polynucleotide having the sequence SEQ ID NO 92, a biologically active fragment thereof or a nucleic acid having a complementary sequence.

[0117] Polymorphisms within the ABC1 Gene: Mutations

[0118] According to the invention, several mutations have been identified in the sequence of the ABC1 gene, these mutations leading to major structural impairments of the ABC1 polypeptide encoded by the mutated sequences. These mutations have been found particularly in patients suffering from severe forms of Tangier disease, associated with serious coronary disorders. Two particularly deleterious mutations are described below.

[0119] 1. Mutation in Exon 12

[0120] This mutation consists both in a deletion of a localized segment of 14 nucleotides (“TGAGAGGAAGTTCT”) from the nucleotide at position 472 to the nucleotide at position 485 of the normal genomic DNA having the sequence SEQ ID NO 2 and in an insertion of an Alu-type sequence of 110 nucleotides into the sequence of exon 12 of the ABC1 gene, upstream of the nucleotide at position 486 of the normal genomic DNA having the sequence SEQ ID NO 2.

[0121] The exon 12 carrying this deletion/insertion mutation has the nucleotide sequence SEQ ID NO 93.

[0122] The corresponding mutated cDNA has the nucleotide sequence SEQ ID NO 94, encodes a mutated ABC1 polypeptide of 2233 amino acids in length, having the sequence SEQ ID NO 140, whose structure is substantially altered compared with the normal ABC1 polypeptide having the sequence SEQ ID NO 139.
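The figures given for the exon 12 mutation can be related to one another with a small calculation. The Python sketch below applies a deletion/insertion to a placeholder sequence. Only the 14-nucleotide segment, its coordinates, the 110-nucleotide insert length and the protein lengths come from the text. The flanking bases and the insert content are placeholders (the real Alu sequence is not reproduced here), and the sketch assumes that the whole net insertion is translated in frame.

```python
def apply_delins(seq: str, del_start: int, del_end: int, insertion: str) -> str:
    """Delete the 1-based inclusive interval [del_start, del_end] and put
    `insertion` in its place."""
    return seq[:del_start - 1] + insertion + seq[del_end:]


DELETED_SEGMENT = "TGAGAGGAAGTTCT"   # 14 nucleotides, as quoted in the text
DEL_START, DEL_END = 472, 485        # positions in SEQ ID NO 2, from the text
ALU_LENGTH = 110                     # length of the inserted Alu-type sequence
NORMAL_PROTEIN_LENGTH = 2201         # amino acids, SEQ ID NO 139

# Placeholder reference: 'N' stands for bases not reproduced in this sketch.
reference = "N" * (DEL_START - 1) + DELETED_SEGMENT + "N" * 20
alu_placeholder = "a" * ALU_LENGTH   # NOT the real Alu sequence

mutant = apply_delins(reference, DEL_START, DEL_END, alu_placeholder)

net_change = len(mutant) - len(reference)
extra_codons, frame_offset = divmod(net_change, 3)

print("net change (nt):", net_change)
print("reading frame preserved:", frame_offset == 0)
print("expected mutated protein length:", NORMAL_PROTEIN_LENGTH + extra_codons)
```

Under these assumptions the net gain is a multiple of three, so the reading frame is maintained, and the expected length agrees with the 2233 amino acids stated for SEQ ID NO 140.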

[0123] The nucleotide sequences SEQ ID NO 93 and 94 as well as the polypeptide sequence SEQ ID NO 140 also form part of the invention.

[0124] 2. Mutation in Exon 13

[0125] This mutation consists of a deletion of the nucleotide (G) at position 1232 of the genomic sequence SEQ ID NO 2, which is located in exon 13 (nucleotide G at position 106 of the sequence of exon 13 SEQ ID NO 17). This point deletion of one base introduces a stop codon in the normal reading frame in the mRNA of the ABC1 gene.

[0126] The sequence of exon 13 of the ABC1 gene carrying this mutation is the polynucleotide having the sequence SEQ ID NO 95.

[0127] The cDNA corresponding to this mutation in exon 13 of the ABC1 gene is represented in the nucleotide sequence SEQ ID NO 96.

[0128] The mutated protein encoded by the mutated ABC1 gene has a length of 574 amino acids, that is to say about a quarter of the length in terms of amino acids of the normal protein. The truncated polypeptide has the amino acid sequence SEQ ID NO 141.
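The way in which the deletion of a single base can bring a stop codon into the reading frame may be illustrated on a toy sequence. The Python sketch below does not use the ABC1 sequence. The 18-nucleotide open reading frame is invented for the example and only the lengths 574 and 2201 come from the text.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def delete_base(seq: str, position: int) -> str:
    """Remove the base at a 1-based position."""
    return seq[:position - 1] + seq[position:]


def sense_codons_before_stop(seq: str):
    """Count codons read from the first base up to the first in-frame stop.

    Returns None when no in-frame stop codon is found.
    """
    for i in range(0, len(seq) - 2, 3):
        if seq[i:i + 3] in STOP_CODONS:
            return i // 3
    return None


# Invented toy ORF: ATG GCT GTG ACC AAA TAA
normal = "ATGGCTGTGACCAAATAA"
# Deleting the G at position 7 shifts the frame: ATG GCT TGA ...
mutant = delete_base(normal, 7)

print(sense_codons_before_stop(normal))   # codons translated in the toy normal ORF
print(sense_codons_before_stop(mutant))   # codons translated after the frameshift

# Ratio of the lengths stated in the text for the real proteins.
print(round(574 / 2201, 2))
```

In the toy example the frameshift creates a premature TGA codon. For the lengths stated in the text, the ratio 574/2201 is about 0.26, in line with the description of the truncated protein as about a quarter of the normal length.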

[0129] The structural characteristics which make it possible to differentiate the normal sequences from the mutated sequences of ABC1 (genomic sequences, messenger RNAs, cDNA) may be exploited in order to produce means of detection of the mutated sequences of ABC1 in a sample, in particular probes specifically hybridizing with the mutated sequences of ABC1 or pairs of primers making it possible to selectively amplify the regions of the ABC1 gene carrying the mutations described above, it being possible to carry out the detection of the presence of these mutations in particular by distinguishing the length of the amplified nucleic acid fragments, by hybridization of the amplified fragments with the aid of the specific probes described above, or by direct sequencing of these amplified fragments.
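As a minimal sketch of detection by fragment length, the expected sizes of amplified fragments can be computed from primer coordinates. The positions used below are those listed in Table V for primers 1 to 4 on SEQ ID NO 2. The calculation assumes 1-based inclusive coordinates and that the amplified fragment runs from the first base of the forward primer to the last base of the complementary-strand primer.

```python
def amplicon_length(forward_start: int, reverse_end: int) -> int:
    """Length of the fragment spanning both primers (1-based, inclusive)."""
    return reverse_end - forward_start + 1


def mutated_length(normal_length: int, deleted: int, inserted: int) -> int:
    """Fragment length after removing `deleted` and adding `inserted` bases."""
    return normal_length - deleted + inserted


# Exon 12 region: primers 1 (313-335) and 2 (Comp 640-663) of Table V.
exon12_normal = amplicon_length(313, 663)
exon12_mutant = mutated_length(exon12_normal, deleted=14, inserted=110)

# Exon 13 region: primers 3 (1005-1029) and 4 (Comp 1472-1496) of Table V.
exon13_normal = amplicon_length(1005, 1496)
exon13_mutant = mutated_length(exon13_normal, deleted=1, inserted=0)

print("exon 12:", exon12_normal, "->", exon12_mutant)
print("exon 13:", exon13_normal, "->", exon13_mutant)
```

Under these assumptions the exon 12 deletion/insertion changes the fragment length by 96 nucleotides, whereas the exon 13 point deletion changes it by a single nucleotide. As a general observation rather than a statement of the present description, a one-nucleotide difference is harder to resolve by length alone, for which case hybridization with specific probes or direct sequencing may be more suitable.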

[0130] Thus, a further subject of the invention is a nucleic acid having at least eight consecutive nucleotides of a polynucleotide chosen from the group consisting of the nucleotide sequences SEQ ID NO 93-96, or a nucleic acid having a complementary sequence.

[0131] Preferably, such a nucleic acid comprises:

[0132] a) either at least two consecutive nucleotides of the Alu sequence located in the sequences SEQ ID NO 93 and 94, preferably 5, 10, 15, 20, 25, 30, 35, 40, 50 or 100 consecutive nucleotides of the Alu sequence located in the sequences SEQ ID NO 93 and 94;

[0133] b) or at least the two “CT” nucleotides situated on either side of the deleted G base, in the sequences SEQ ID NO 95 and 96.

[0134] The primers hybridizing with a nucleic sequence located in the region of an ABC1 sequence (genomic sequence, messenger RNA) carrying either of the two mutations described above also form part of the invention.

[0135] The invention relates, in addition, to a nucleic acid having at least 80% nucleotide identity with a polynucleotide chosen from the group consisting of the nucleotide sequences SEQ ID NO 93-96, or a nucleic acid having a complementary sequence.

[0136] The invention also relates to a nucleic acid hybridizing, under high stringency conditions, with a polynucleotide chosen from the group consisting of the nucleotide sequences SEQ ID NO 93-96, or a nucleic acid having a complementary sequence.

[0137] Other Polymorphisms

[0138] Other polymorphisms have been found within the sequence of the ABC1 gene, in particular nucleotide substitutions located both in the coding regions (exons) and in the noncoding regions.

[0139] As regards the polymorphisms found in the coding regions, they are essentially substitutions of a single nucleotide located on the third base of the codons of the open reading frame of ABC1, these substitutions causing no modification as regards the nature of the amino acid encoded, taking into account the degeneracy of the genetic code, which is well known to persons skilled in the art.

[0140] These polymorphisms are represented in the present description in the form of nucleotide sequences of 41 bases in length, the polymorphic base being located at the center of the polymorphic fragment. For each of the polymorphisms identified, each allele is thus represented as a sequence of 41 bases, the polymorphism itself being defined by the two nucleotide sequences corresponding respectively to each of the forms. The polymorphisms identified in the ABC1 gene are represented in Table IV below.

TABLE IV
Polymorphisms found in the ABC1 gene

Designation No.   Position in the sequence SEQ ID NO 2   Allele 1 SEQ ID NO   Allele 2 SEQ ID NO   Polymorphic base Allele 1/Allele 2
1                 397                                    97                   98                   G/A
2                 1324                                   99                   100                  T/A
3                 3028                                   101                  102                  C/T
4                 3234                                   103                  104                  C/A
5                 3390                                   105                  106                  A/G
6                 6854                                   107                  108                  G/A
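The representation of each allele as a 41-base fragment centered on the polymorphic base can be sketched as follows. The Python function below is illustrative only: the reference string is a toy sequence and not SEQ ID NO 2, and the function and parameter names are hypothetical.

```python
FLANK = 20  # 20 bases on each side + 1 polymorphic base = 41 bases


def allele_fragments(reference: str, position: int, allele1: str, allele2: str,
                     flank: int = FLANK):
    """Return the two fragments centered on the 1-based `position`,
    one per allele, each of length 2 * flank + 1."""
    i = position - 1
    if i - flank < 0 or i + flank >= len(reference):
        raise ValueError("not enough flanking sequence around the position")
    left = reference[i - flank:i]
    right = reference[i + 1:i + 1 + flank]
    return left + allele1 + right, left + allele2 + right


# Toy reference (60 bases); position 30 carries a C in this toy sequence.
toy_reference = "ACGT" * 15
fragment_1, fragment_2 = allele_fragments(toy_reference, 30, "C", "T")

print(fragment_1)
print(fragment_2)
print(len(fragment_1), fragment_1[20], fragment_2[20])
```

The two fragments differ only at the central position (index 20 in 0-based numbering), which mirrors the way each polymorphism is defined by a pair of sequences in the description.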

[0141] The detection of these polymorphisms within a DNA sample obtained from a subject may, for example, be carried out by a specific amplification of the nucleotide region of ABC1 containing the polymorphic base, and then sequencing the amplified fragment in order to determine the nature of the allele or of the alleles carried by said subject.

[0142] The detection of these polymorphisms in a DNA sample obtained from a subject may also be carried out with the aid of nucleotide probes or primers specifically hybridizing with a given allele containing one of the polymorphic bases of a polymorphism of the ABC1 gene according to the invention.

[0143] By way of illustration, appropriate nucleotide primers are for example primers whose base at the 3′ end hybridizes with the base located immediately on the 5′ side of the polymorphic base of the fragment comprising said polymorphism. After the step of hybridization of the specific primer, a step of extension with a mixture of the two dideoxynucleotides complementary to the polymorphic base of said polymorphism, for example differentially labeled by fluorescence, and then a step of detection of the fluorescence signal obtained make it possible to determine which of the two differentially labeled fluorescent dideoxynucleotides has been incorporated and to directly deduce the nature of the polymorphic base present at the level of this polymorphism.

[0144] Various approaches may be used for the labeling and detection of the dideoxynucleotides. A method in homogeneous phase based on FRET (“Fluorescence resonance energy transfer”) has been described by Chen and Kwok (1997). According to this method, the amplified fragments of genomic DNA containing polymorphisms are incubated with a primer labeled with fluorescein at the 5′ end in the presence of labeled dideoxynucleotide triphosphate and a modified Taq polymerase. The labeled primer is extended by one base by incorporation of the labeled dideoxynucleotide specific for the allele present on the complementary genomic DNA sequence. At the end of this genotyping reaction, the fluorescence intensities for the two labeling compounds for the labeled dideoxynucleotides are directly analyzed without separation or purification. All these steps may be carried out in the same tube and the modifications of the fluorescence signal monitored in real time. According to another embodiment, the extended primer may be analyzed by MALDI-TOF type mass spectrometry. The base located at the level of the polymorphic site is identified by measuring the mass added to the microsequencing primer (Haff and Smirnov, 1997).
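The logic of the single-base extension step can be illustrated with a short simulation. The Python sketch below is a simplified model under stated assumptions: the template is written 5′ to 3′, the primer is written 5′ to 3′ and is the reverse complement of the template segment lying immediately 3′ of the polymorphic base on that template strand, and hybridization is modeled as an exact match. The sequences are invented toy sequences, and labeling and detection chemistry are not modeled.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq: str) -> str:
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def incorporated_dideoxynucleotide(template: str, primer: str):
    """Return the base added to the 3' end of the primer, or None.

    The primer anneals where its reverse complement occurs in the template.
    Extension proceeds toward the 5' end of the template, so the base that
    is copied is the template base just 5' of the annealing site.
    """
    site = reverse_complement(primer)
    start = template.find(site)
    if start < 1:   # not found, or no template base left to copy
        return None
    return COMPLEMENT[template[start - 1]]


# Toy templates differing only at the polymorphic base (index 4).
template_allele_g = "TTCA" + "G" + "ACGGTCAT"
template_allele_a = "TTCA" + "A" + "ACGGTCAT"
primer = reverse_complement("ACGGTCAT")   # anneals next to the polymorphic base

for name, template in (("G allele", template_allele_g),
                       ("A allele", template_allele_a)):
    added = incorporated_dideoxynucleotide(template, primer)
    print(name, "-> incorporated ddNTP:", added,
          "-> template base:", COMPLEMENT[added])
```

Because each allele leads to the incorporation of a different dideoxynucleotide, reading which label has been incorporated gives the base at the polymorphic site. A heterozygous sample would be expected to yield both signals.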

[0145] Such nucleotide primers may, for example, be immobilized on a support. Furthermore, it is possible to immobilize on a support, for example in an orderly manner, multiple specific primers as described above, each of the primers being suited to the detection of one of the polymorphisms of the ABC1 gene according to the invention.

[0146] The polymorphisms of the ABC1 gene according to the invention are useful in particular as genetic markers in studies of association between the presence of a given allele in a subject and the predisposition of this subject to a given pathology, in particular to one of the pathologies already associated with the chromosomal region 9q31, preferably with a pathology linked to a dysfunction in the reverse transport of cholesterol.

[0147] The methods for the genetic analysis of complex characters (phenotypes) are of various types (Lander and Schork, 1994). In general, the biallelic polymorphisms according to the invention are useful in any of the methods described in the state of the art intended to demonstrate a statistically significant correlation between a genotype and a phenotype. The biallelic polymorphisms may be used in linkage analyses and in allele sharing methods. Preferably, the biallelic polymorphisms according to the invention are used in association studies to identify genes associated with detectable characters (phenotypes), an approach which does not require the use of families affected by the character, and which allows, in addition, the identification of genes associated with complex and sporadic characters.
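One elementary way of testing for a correlation between an allele and a phenotype in an association study is a chi-square test on a 2 x 2 table of allele counts in affected and unaffected subjects. The Python sketch below is given as an illustration of this general principle only. It is not one of the methods cited in the present description, and the counts are invented for the example and are not data from any study.

```python
import math


def chi_square_2x2(a: int, b: int, c: int, d: int):
    """Pearson chi-square (1 degree of freedom, no continuity correction)
    for the table [[a, b], [c, d]]; returns (statistic, p_value)."""
    n = a + b + c + d
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    if denominator == 0:
        raise ValueError("a row or column total is zero")
    statistic = n * (a * d - b * c) ** 2 / denominator
    # Survival function of the chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(statistic / 2.0))
    return statistic, p_value


# Invented allele counts (chromosomes), for illustration only.
cases_allele1, cases_allele2 = 60, 40
controls_allele1, controls_allele2 = 40, 60

statistic, p_value = chi_square_2x2(cases_allele1, cases_allele2,
                                    controls_allele1, controls_allele2)
print(f"chi-square = {statistic:.2f}, p = {p_value:.4f}")
```

In practice, association analyses also have to address issues such as population stratification and multiple testing, which this sketch does not attempt to handle.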

[0148] Other statistical methods using biallelic polymorphisms according to the invention are for example those described by Forsell et al. (1997), Xiong et al. (1999), Horvath et al. (1998), Sham et al. (1995) or Nickerson et al. (1992).

[0149] According to another aspect, the invention also relates to the nucleotide sequences of the ABC1 gene comprising at least one biallelic polymorphism as described above.

[0150] Thus, the invention also relates to a nucleic acid having at least eight consecutive nucleotides of a polynucleotide chosen from the group consisting of the nucleotide sequences SEQ ID NO 97-108 and comprising the polymorphic base, or a nucleic acid having a complementary sequence.

[0151] Nucleotide Probes and Primers

[0152] The nucleic acid fragments derived from any one of the nucleotide sequences SEQ ID NO 1-14, 15-47, 48-89, 90, 92, 94-96 and 97-108 are useful for the detection of the presence of at least one copy of a nucleotide sequence of the ABC1 gene or of a fragment or of a variant (containing a mutation or a polymorphism) thereof in a sample.

[0153] The nucleotide probes or primers according to the invention comprise at least 8 consecutive nucleotides of a nucleic acid chosen from the group consisting of the sequences SEQ ID NO 1-14, 15-47, 48-89, 90, 92, 93-96 and 97-108, or of a nucleic acid having a complementary sequence.

[0154] Preferably, nucleotide probes or primers according to the invention will have a length of 10, 12, 15, 18 or 20 to 25, 35, 40, 50, 70, 80, 100, 200, 500, 1000, 1500 consecutive nucleotides of a nucleic acid according to the invention, in particular a nucleic acid having a nucleotide sequence chosen from the sequences SEQ ID NO 1-14, 15-47, 48-89, 90, 92, 93-96 and 97-108, or of a nucleic acid having a complementary sequence.

[0155] Alternatively, a nucleotide probe or primer according to the invention will consist of and/or comprise the fragments having a length of 12, 15, 18, 20, 25, 35, 40, 50, 100, 200, 500, 1000, 1500 consecutive nucleotides of a nucleic acid according to the invention, more particularly of a nucleic acid chosen from the sequences SEQ ID NO 1-14, 15-47, 48-89, 90, 92, 93-96 and 97-108, or of a nucleic acid having a complementary sequence.

[0156] The definition of a nucleotide probe or primer according to the invention therefore covers oligonucleotides which hybridize, under the high stringency hybridization conditions defined above, with a nucleic acid chosen from the sequences SEQ ID NO 1-14, 15-47, 48-89, 90, 92, 93-96 and 97-108 or with a sequence complementary thereto.

[0157] Examples of primers and pairs of primers which make it possible to amplify various regions of the ABC1 gene are presented in Table V below.

TABLE V
Primers for the amplification of nucleic fragments of the ABC1 gene

Primer No.   Located in SEQ ID   Position in the sequence   Sequence of the primer (SEQ ID NO)   Region for hybridization
1            2                   313-335                    109                                  Intron 11
2            2                   Comp 640-663               110                                  Intron 12
3            2                   1005-1029                  111                                  Intron 12
4            2                   Comp 1472-1496             112                                  Intron 13
5            2                   2930-2954                  113                                  Intron 13
6            2                   Comp 3444-3468             114                                  Intron 14
7            2                   4988-5012                  115                                  Intron 14
8            2                   Comp 5338-5362             116                                  Intron 15
9            2                   6240-6262                  117                                  Intron 15
10           2                   Comp 6581-6603             118                                  Intron 16
11           5                   1369-1391                  119                                  Intron 21
12           5                   Comp 1748-1770             120                                  Intron 22
13           5                   3868-3890                  121                                  Intron 23
14           5                   Comp 4240-4262             122                                  Intron 24
15           6                   3587-3610                  123                                  Intron 28
16           6                   Comp 3881-3903             124                                  Intron 29
17           6                   6753-6775                  125                                  Intron 29
18           6                   Comp 7112-7134             126                                  Intron 30
19           7                   1060-1082                  127                                  Intron 30
20           7                   Comp 1377-1399             128                                  Intron 31
21           7                   3574-3596                  129                                  Intron 32
22           7                   Comp 3909-3931             130                                  Intron 33
23           7                   5161-5183                  131                                  Intron 33
24           7                   Comp 5463-5485             132                                  Intron 34
25           8                   100-122                    133                                  Intron 34
26           8                   Comp 475-497               134                                  Intron 35
27           9                   841-861                    135                                  Intron 35
28           9                   Comp 1249-1271             136                                  Intron 36
29           10                  455-477                    137                                  Intron 36
30           10                  Comp 966-988               138                                  Intron 38

[0158] According to a first embodiment of preferred probes and primers according to the invention, they comprise all or part of a polynucleotide chosen from the nucleotide sequences SEQ ID NO 109-138, or nucleic acids having a complementary sequence.
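The primer coordinates of Table V can be used to estimate the size of each amplified fragment. The Python sketch below covers the five pairs located on SEQ ID NO 2. It assumes that consecutive odd-numbered and even-numbered primers form a pair, that "Comp" denotes a primer on the complementary strand, and that positions are 1-based and inclusive. These are readings of the table rather than statements of the description.

```python
# (primer no., start, end, on complementary strand), from Table V, SEQ ID NO 2.
PRIMERS_SEQ_ID_2 = [
    (1, 313, 335, False), (2, 640, 663, True),
    (3, 1005, 1029, False), (4, 1472, 1496, True),
    (5, 2930, 2954, False), (6, 3444, 3468, True),
    (7, 4988, 5012, False), (8, 5338, 5362, True),
    (9, 6240, 6262, False), (10, 6581, 6603, True),
]


def amplicon_sizes(primers):
    """Pair each forward primer with the complementary-strand primer that
    follows it and return {(forward no., reverse no.): fragment length}."""
    sizes = {}
    for forward, reverse in zip(primers[0::2], primers[1::2]):
        if forward[3] or not reverse[3]:
            raise ValueError("expected a forward primer followed by a Comp primer")
        sizes[(forward[0], reverse[0])] = reverse[2] - forward[1] + 1
    return sizes


sizes = amplicon_sizes(PRIMERS_SEQ_ID_2)
for pair, length in sizes.items():
    print(f"primers {pair[0]} and {pair[1]}: {length} bp")
```

On these assumptions, the fragments amplified from SEQ ID NO 2 range from a few hundred to several hundred base pairs, each spanning one exon together with flanking intronic sequence.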

[0159] A nucleotide primer or probe according to the invention may be prepared by any suitable method well known to persons skilled in the art, including by cloning and action of restriction enzymes or by direct chemical synthesis according to techniques such as the phosphodiester method by Narang et al. (1979) or by Brown et al. (1979), the diethylphosphoramidite method by Beaucage et al. (1980) or the technique on a solid support described in European patent No. EP 0,707,592.

[0160] Each of the nucleic acids according to the invention, including the oligonucleotide probes and primers described above, may be labeled, if desired, by incorporating a marker which can be detected by spectroscopic, photochemical, biochemical, immunochemical or chemical means.

[0161] For example, such markers may consist of radioactive isotopes (32P, 33P, 3H, 35S), fluorescent molecules (5-bromodeoxyuridine, fluorescein, acetylaminofluorene, digoxigenin) or ligands such as biotin.

[0162] The labeling of the probes is preferably carried out by incorporating labeled molecules into the polynucleotides by primer extension, or alternatively by addition to the 5′ or 3′ ends.

[0163] Examples of nonradioactive labeling of nucleic acid fragments are described in particular in French patent No. 78 109 75 or in the articles by Urdea et al. (1988) or Sanchez-Pescador et al. (1988).

[0164] Advantageously, the probes according to the invention may have structural characteristics of the type to allow amplification of the signal, such as the probes described by Urdea et al. (1991) or alternatively in European patent No. EP-0,225,807 (CHIRON).

[0165] The oligonucleotide probes according to the invention may be used in particular in Southern-type hybridizations with the genomic DNA or alternatively in hybridizations with the corresponding messenger RNA when the expression of the corresponding transcript is sought in a sample.

[0166] The probes according to the invention may also be used for the detection of products of PCR amplification or alternatively for the detection of mismatches.

[0167] Nucleotide probes or primers according to the invention may be immobilized on a solid support. Such solid supports are well known to persons skilled in the art and comprise surfaces of wells of microtiter plates, polystyrene beads, magnetic beads, nitrocellulose bands or microparticles such as latex particles.

[0168] Consequently, the present invention also relates to a method of detecting the presence of a nucleic acid as described above in a sample, said method comprising the steps of:

[0169] 1) bringing one or more nucleotide probes according to the invention into contact with the sample to be tested;

[0170] 2) detecting the complex which may have formed between the probe(s) and the nucleic acid present in the sample.

[0171] According to a specific embodiment of the method of detection according to the invention, the oligonucleotide probes are immobilized on a support.

[0172] According to another aspect, the oligonucleotide probes comprise a detectable marker.

[0173] The invention relates, in addition, to a box or kit for detecting the presence of a nucleic acid according to the invention in a sample, said box comprising:

[0174] a) one or more nucleotide probes as described above;

[0175] b) where appropriate, the reagents necessary for the hybridization reaction.

[0176] According to a first aspect, the detection box or kit is characterized in that the probe(s) are immobilized on a support.

[0177] According to a second aspect, the detection box or kit is characterized in that the oligonucleotide probes comprise a detectable marker.

[0178] According to a specific embodiment of the detection kit described above, such a kit will comprise a plurality of oligonucleotide probes in accordance with the invention which may be used to detect target sequences of interest or alternatively to detect mutations in the coding regions or the noncoding regions of the nucleic acids according to the invention, more particularly of the nucleic acids having the sequences SEQ ID NO 1-14, 15-47, 48-89, 90, 92, 93-96 and 97-108 or the nucleic acids having a complementary sequence.

[0179] Thus, the probes according to the invention, immobilized on a support, may be ordered into matrices such as “DNA chips”. Such ordered matrices have in particular been described in U.S. Pat. No. 5,143,854 and in PCT applications No. WO 90/15070 and WO 92/10092.

[0180] Support matrices on which oligonucleotide probes have been immobilized at a high density are for example described in U.S. Pat. No. 5,412,087 and in PCT application No. WO 95/11995.

[0181] The nucleotide primers according to the invention may be used to amplify any one of the nucleic acids according to the invention, and more particularly all or part of a nucleic acid having the sequences SEQ ID NO 1-14, 15-47, 48-89, 90, 92, 93-96 and 97-108, or alternatively a variant thereof.

[0182] Another subject of the invention relates to a method of amplifying a nucleic acid according to the invention, and more particularly a nucleic acid having the sequences SEQ ID NO 1-14, 15-47, 48-89, 90, 92, 93-96 and 97-108 or a fragment or a variant thereof contained in a sample, said method comprising the steps of:

[0183] a) bringing the sample in which the presence of the target nucleic acid is suspected into contact with a pair of nucleotide primers whose hybridization position is located respectively on the 5′ side and on the 3′ side of the region of the target nucleic acid whose amplification is sought, in the presence of the reagents necessary for the amplification reaction; and

[0184] b) detecting the amplified nucleic acids.

[0185] To carry out the amplification method as defined above, use will be advantageously made of any of the nucleotide primers described above.

[0186] The subject of the invention is, in addition, a box or kit for amplifying a nucleic acid according to the invention, and more particularly all or part of a nucleic acid having the sequences SEQ ID NO 1-14, 15-47, 48-89, 90, 92, 93-96 and 97-108, said box or kit comprising:

[0187] a) a pair of nucleotide primers in accordance with the invention, whose hybridization position is located respectively on the 5′ side and 3′ side of the target nucleic acid whose amplification is sought;

[0188] b) where appropriate, the reagents necessary for the amplification reaction.

[0189] Such an amplification box or kit will advantageously comprise at least one pair of nucleotide primers as described above.

[0190] According to a first preferred embodiment, primers according to the invention comprise all or part of a polynucleotide chosen from the nucleotide sequences SEQ ID NO 109 and 110, making it possible to amplify the region of exon 12 of the ABC1 gene carrying the first mutation (deletion/insertion) described above, or nucleic acids having a complementary sequence.

[0191] According to a second preferred embodiment, primers according to the invention comprise all or part of a polynucleotide chosen from the nucleotide sequences SEQ ID NO 111 and 112, making it possible to amplify the region of exon 13 of the ABC1 gene carrying the second mutation (deletion of a G base) described above, or nucleic acids having a complementary sequence.

[0192] According to a third preferred embodiment, primers according to the invention comprise, generally, all or part of a polynucleotide chosen from the nucleotide sequences SEQ ID NO 109-138, or nucleic acids having a complementary sequence.

[0193] According to a fourth preferred embodiment, the invention also relates to nucleotide primers comprising at least 15 consecutive nucleotides of a nucleic acid chosen from the group consisting of the sequences SEQ ID NO 97-108 or a nucleic acid having a complementary sequence, the base of the 3′ end of these primers being complementary to the nucleotide located immediately on the 5′ side of the polymorphic base of one of the sequences SEQ ID NO 97-108 or of their complementary sequences.

[0194] According to another aspect, the invention also relates to nucleotide primers comprising at least 15 consecutive nucleotides of a nucleic acid chosen from the group consisting of the sequences SEQ ID NO 97-108 or a nucleic acid having a complementary sequence, the base of the 3′ end of these primers being complementary to a nucleotide situated at 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20 nucleotides or more on the 5′ side of the polymorphic base of one of the sequences SEQ ID NO 97-108 or of their complementary sequences. To construct primers whose nucleotide at the 3′ end is complementary to a nucleotide located at more than 20 nucleotides on the 5′ side of the polymorphic base of one of the sequences SEQ ID NO 97-108, persons skilled in the art will advantageously refer to the corresponding genomic sequence among the sequences SEQ ID NO 1-14 or SEQ ID NO 15-47 and 48-90, comprising the polymorphism for which the nature of the allele is sought.
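The construction of such primers from one of the 41-base polymorphic fragments can be sketched as follows. In this Python illustration the primer is taken directly as a sub-sequence of the fragment, so that it would hybridize with the complementary strand. The fragment is a toy sequence and the function and parameter names are hypothetical. An offset of 1 places the 3′ end of the primer immediately on the 5′ side of the polymorphic base.

```python
FRAGMENT_LENGTH = 41
CENTER_INDEX = 20        # 0-based index of the polymorphic base
MIN_PRIMER_LENGTH = 15


def primer_upstream_of_polymorphism(fragment: str, length: int = 15,
                                    offset: int = 1) -> str:
    """Return `length` consecutive nucleotides of `fragment` whose 3' end
    lies `offset` nucleotides on the 5' side of the central base."""
    if len(fragment) != FRAGMENT_LENGTH:
        raise ValueError("expected a 41-base fragment")
    if length < MIN_PRIMER_LENGTH or offset < 1:
        raise ValueError("length must be >= 15 and offset >= 1")
    end = CENTER_INDEX - offset + 1      # exclusive slice end
    start = end - length
    if start < 0:
        raise ValueError("the 41-base fragment is too short for this primer; "
                         "a longer genomic sequence would be needed")
    return fragment[start:end]


# Toy fragment: 5 + 15 flanking bases, polymorphic base 'N', 20 flanking bases.
toy_fragment = "AAAAA" + "CCCCCGGGGGTTTTT" + "N" + "A" * 20

print(primer_upstream_of_polymorphism(toy_fragment))             # offset 1
print(primer_upstream_of_polymorphism(toy_fragment, offset=6))   # offset 6
```

With only 20 bases available on the 5′ side of the polymorphic base, a 15-nucleotide primer can be placed at offsets 1 to 6 within the fragment. Beyond that, the sketch raises an error, which corresponds to the situation in which reference is made to the longer genomic sequence.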

[0195] Such primers are particularly useful in the context of methods of genotyping subjects and/or of genotyping populations, in particular in the context of studies of association between particular allele forms or particular forms of groups of alleles (haplotypes) in subjects and the existence of a particular phenotype (character) in these subjects, for example the predisposition of these subjects to develop diseases linked to a deficiency in the reverse transport of cholesterol, or alternatively the predisposition of these subjects to develop a pathology whose candidate chromosomal region is situated on chromosome 9, more precisely on the 9q arm and still more precisely in the 9q31 locus.

[0196] Recombinant Vectors

[0197] The invention also relates to a recombinant vector comprising a nucleic acid according to the invention.

[0198] Advantageously, such a recombinant vector will comprise a nucleic acid chosen from the following nucleic acids:

[0199] a) a nucleic acid having the sequence SEQ ID NO 92 or a biologically active fragment thereof;

[0200] b) a nucleic acid comprising a polynucleotide having the sequence SEQ ID NO 91, 94 or 96;

[0201] c) a nucleic acid comprising a polynucleotide chosen from the group consisting of the nucleotide sequences SEQ ID NO 15-47 and 48-90;

[0202] d) a nucleic acid having at least 80% nucleotide identity with a nucleic acid chosen from the group consisting of the sequences SEQ ID NO 15-47 and 48-90 or a fragment or a variant thereof;

[0203] e) a nucleic acid hybridizing, under high stringency hybridization conditions, with a nucleic acid having the sequences SEQ ID NO 15-47 and 48-90, or a fragment or a variant thereof.

[0204] “Vector” for the purposes of the present invention will be understood to mean a circular or linear DNA or RNA molecule which is either in single-stranded or double-stranded form.

[0205] According to a first embodiment, a recombinant vector according to the invention is used to amplify the nucleic acid which is inserted therein after transformation or transfection of the desired cellular host.

[0206] According to a second embodiment, it corresponds to expression vectors comprising, in addition to a nucleic acid in accordance with the invention, regulatory sequences which make it possible to direct the transcription and/or translation thereof.

[0207] According to an advantageous embodiment, a recombinant vector according to the invention will comprise in particular the following elements:

[0208] (1) elements for regulating the expression of the nucleic acid to be inserted, such as promoters and enhancer sequences;

[0209] (2) the coding sequence contained in the nucleic acid in accordance with the invention to be inserted into such a vector, said coding sequence being placed in phase with the regulatory signals described in (1); and

[0210] (3) appropriate sequences for initiation and termination of the transcription.

[0211] In addition, the recombinant vectors according to the invention may include one or more origins for replication in the cellular hosts in which their amplification or their expression is sought, as well as selectable markers.

[0212] By way of example, the bacterial promoters may be the LacI or LacZ promoters, the T3 or T7 bacteriophage RNA polymerase promoters, the lambda phage PR or PL promoters.

[0213] The promoters for eukaryotic cells will comprise the HSV virus thymidine kinase promoter or alternatively the mouse metallothionein-L promoter.

[0214] Generally, for the choice of a suitable promoter, persons skilled in the art can advantageously refer to the book by Sambrook et al. (1989) cited above or to the techniques described by Fuller et al. (1996).

[0215] When the expression of the genomic sequence of the ABC1 gene is sought, use will preferably be made of vectors capable of containing large insertion sequences. In this particular embodiment, bacteriophage vectors, such as the P1 bacteriophage vectors p158 or p158/neo8 described by Sternberg (1992, 1994), will preferably be used.

[0216] The preferred bacterial vectors according to the invention are for example the vectors pBR322 (ATCC 37017) or alternatively vectors such as pAA223-3 (Pharmacia, Uppsala, Sweden), and pGEM1 (Promega Biotech, Madison, Wis., United States).

[0217] There may also be cited other commercially available vectors such as the vectors pQE70, pQE60, pQE9 (Qiagen), psiX174, pBluescript SA, pNH8A, pNH16A, pNH18A, pNH46A, pWLNEO, pSV2CAT, pOG44, pXTI, pSG(Stratagene).

[0218] They may also be vectors of the baculovirus type such as the vector pVL1392/1393 (Pharmingen) used to transfect cells of the Sf9 line (ATCC No. CRL 1711) derived from Spodoptera frugiperda.

[0219] They may also be adenoviral vectors such as the human adenovirus of type 2 or 5.

[0220] A recombinant vector according to the invention may also be a retroviral vector or an adeno-associated vector (AAV). Such adeno-associated vectors are for example described by Flotte et al. (1992), Samulski et al. (1989), or McLaughlin BA et al. (1996).

[0221] To allow the expression of the polynucleotides according to the invention, the latter must be introduced into a host cell. The introduction of the polynucleotides according to the invention into a host cell may be carried out in vitro, according to the techniques well known to persons skilled in the art for transforming or transfecting cells, either in primary culture, or in the form of cell lines. It is also possible to carry out the introduction of the polynucleotides according to the invention in vivo or ex vivo, for the prevention or treatment of diseases linked to a deficiency in the reverse transport of cholesterol.

[0222] To introduce the polynucleotides or the vectors into a host cell, persons skilled in the art can advantageously refer to various techniques, such as the calcium phosphate precipitation technique (Graham et al., 1973; Chen et al., 1987), DEAE Dextran (Gopal, 1985), electroporation (Tur-Kaspa, 1986; Potter et al., 1984), direct microinjection (Harland et al., 1985), or liposomes charged with DNA (Nicolau et al., 1982; Fraley et al., 1979).

[0223] Once the polynucleotide has been introduced into the host cell, it may be stably integrated into the genome of the cell. The integration may be achieved at a precise site of the genome, by homologous recombination, or it may be random. In some embodiments, the polynucleotide may be stably maintained in the host cell in the form of an episome fragment, the episome comprising sequences allowing the retention and the replication of the latter, either independently, or in a synchronized manner with the cell cycle.

[0224] According to a specific embodiment, a method of introducing a polynucleotide according to the invention into a host cell, in particular a host cell obtained from a mammal, in vivo, comprises a step during which a preparation comprising a pharmaceutically compatible vector and a “naked” polynucleotide according to the invention, placed under the control of appropriate regulatory sequences, is introduced by local injection at the level of the chosen tissue, for example a smooth muscle tissue, the “naked” polynucleotide being absorbed by the cells of this tissue.

[0225] Compositions for use in vitro and in vivo comprising “naked” polynucleotides are for example described in PCT Application No. WO 95/11307 (Institut Pasteur, Inserm, University of Ottawa) as well as in the articles by Tacson et al. (1996) and Huygen et al. (1996).

[0226] According to a specific embodiment of the invention, a composition is provided for the in vivo production of the ABC1 protein. This composition comprises a polynucleotide encoding the ABC1 polypeptide placed under the control of appropriate regulatory sequences, in solution in a physiologically acceptable vector.

[0227] The quantity of vector which is injected into the chosen host organism varies according to the site of the injection. As a guide, between about 0.1 and about 100 μg of polynucleotide encoding the ABC1 protein may be injected into the body of an animal, preferably of a patient likely to develop a disease linked to a deficiency in the reverse transport of cholesterol or who has already developed this disease, in particular a patient having a predisposition to Tangier disease or who has already developed the disease.

[0228] Consequently, the invention also relates to a pharmaceutical composition intended for the prevention of, or the treatment of subjects affected by, a dysfunction in the reverse transport of cholesterol, comprising a nucleic acid encoding the ABC1 protein, in combination with one or more physiologically compatible excipients.

[0229] Advantageously, such a composition will comprise the polynucleotide having the sequence SEQ ID NO 91, placed under the control of appropriate regulatory elements.

[0230] The subject of the invention is, in addition, a pharmaceutical composition intended for the prevention of, or the treatment of subjects affected by, a dysfunction in the reverse transport of cholesterol, comprising a recombinant vector according to the invention, in combination with one or more physiologically compatible excipients.

[0231] The invention also relates to the use of a nucleic acid according to the invention, encoding the ABC1 protein, for the manufacture of a medicament intended for the prevention of atherosclerosis in its various forms or, more particularly, for the treatment of subjects affected by a dysfunction in the reverse transport of cholesterol.

[0232] The invention also relates to the use of a recombinant vector according to the invention, comprising a nucleic acid encoding the ABC1 protein, for the manufacture of a medicament intended for the prevention of atherosclerosis in its various forms or, more particularly, for the treatment of subjects affected by a dysfunction in the reverse transport of cholesterol.

[0233] Vectors Useful in Methods of Somatic Gene Therapy and Compositions Containing Such Vectors

[0234] The present invention also relates to a new therapeutic approach for the treatment of pathologies linked to the transport of cholesterol. It provides an advantageous solution to the disadvantages of the prior art, by demonstrating the possibility of treating the pathologies linked to the transport of cholesterol by gene therapy, by the transfer and expression in vivo of a gene encoding an ABC1 protein involved in the transport and the metabolism of cholesterol. The invention thus offers a simple means allowing a specific and effective treatment of related pathologies such as, for example, atherosclerosis.

[0235] Gene therapy consists in correcting a deficiency or an abnormality (mutation, aberrant expression and the like) and in bringing about the expression of a protein of therapeutic interest by introducing genetic information into the affected cell or organ. This genetic information may be introduced either ex vivo into a cell extracted from the organ, the modified cell then being reintroduced into the body, or directly in vivo into the appropriate tissue. In this second case, various techniques exist, among which various transfection techniques involving complexes of DNA and DEAE-dextran (Pagano et al., J. Virol. 1 (1967) 891), of DNA and nuclear proteins (Kaneda et al., Science 243 (1989) 375), of DNA and lipids (Felgner et al., PNAS 84 (1987) 7413), the use of liposomes (Fraley et al., J. Biol. Chem. 255 (1980) 10431), and the like. More recently, the use of viruses as vectors for the transfer of genes has appeared as a promising alternative to these physical transfection techniques. In this regard, various viruses have been tested for their capacity to infect certain cell populations, in particular the retroviruses (RSV, HMS, MMS, and the like), the HSV virus, the adeno-associated viruses and the adenoviruses.

[0236] The present invention therefore also relates to a new therapeutic approach for the treatment of pathologies linked to the transport of cholesterol, consisting in transferring and in expressing in vivo genes encoding ABC1. In a particularly advantageous manner, the applicant has now found that it is possible to construct recombinant viruses containing a DNA sequence encoding an ABC1 protein involved in the metabolism of cholesterol, to administer these recombinant viruses in vivo, and that this administration allows a stable and effective expression of a biologically active ABC1 protein in vivo, with no cytopathological effect.

[0237] The present invention also results from the demonstration that adenoviruses constitute particularly efficient vectors for the transfer and the expression of the ABC1 gene. In particular, the present invention shows that the use of recombinant adenoviruses as vectors makes it possible to obtain sufficiently high levels of expression of this gene to produce the desired therapeutic effect. Other viral vectors such as retroviruses or adeno-associated viruses (AAV) allowing a stable expression of the gene are also claimed.

[0238] The present invention thus offers a new approach for the treatment and prevention of cardiovascular and neurological pathologies linked to the abnormalities of the transport of cholesterol.

[0239] The subject of the invention is therefore also a defective recombinant virus comprising a nucleic sequence encoding an ABC1 protein involved in the metabolism of cholesterol.

[0240] The invention also relates to the use of such a defective recombinant virus for the preparation of a pharmaceutical composition intended for the treatment and/or for the prevention of cardiovascular diseases.

[0241] The present invention also relates to the use of cells genetically modified ex vivo with a virus as described above, or of cells producing such viruses, implanted in the body, allowing a prolonged and effective expression in vivo of a biologically active ABC1 protein.

[0242] The present invention shows that it is possible to incorporate a DNA sequence encoding ABC1 into a viral vector, and that these vectors make it possible to effectively express a biologically active, mature form. More particularly, the invention shows that the in vivo expression of ABC1 may be obtained by direct administration of an adenovirus or by implantation of a producing cell or of a cell genetically modified by an adenovirus or by a retrovirus incorporating such a DNA.

[0243] The present invention is particularly advantageous because it makes it possible to induce a controlled expression, and with no harmful effect, of ABC1 in organs which are not normally involved in the expression of this protein. In particular, a significant release of the ABC1 protein is obtained by implantation of cells producing vectors of the invention, or infected ex vivo with vectors of the invention.

[0244] The activity of transport of cholesterol produced in the context of the present invention may be of the human or animal ABC1 type. The nucleic sequence used in the context of the present invention may be a cDNA, a genomic DNA (gDNA), an RNA (in the case of retroviruses) or a hybrid construct consisting, for example, of a cDNA into which one or more introns would be inserted. It may also involve synthetic or semisynthetic sequences. In a particularly advantageous manner, a cDNA or a gDNA is used. In particular, the use of a gDNA allows a better expression in human cells. To allow their incorporation into a viral vector according to the invention, these sequences are advantageously modified, for example by site-directed mutagenesis, in particular for the insertion of appropriate restriction sites. The sequences described in the prior art are indeed not constructed for use according to the invention, and prior adaptations may prove necessary, in order to obtain substantial expressions. In the context of the present invention, the use of a nucleic sequence encoding a human ABC1 protein is preferred. Moreover, it is also possible to use a construct encoding a derivative of these ABC1 proteins. A derivative of these ABC1 proteins comprises, for example, any sequence obtained by mutation, deletion and/or addition relative to the native sequence, and encoding a product retaining the cholesterol transport activity. These modifications may be made by techniques known to a person skilled in the art (see general molecular biological techniques below). The biological activity of the derivatives thus obtained can then be easily determined, as indicated in particular in the examples of the measurement of the efflux of cholesterol from cells. The derivatives for the purposes of the invention may also be obtained by hybridization from nucleic acid libraries, using as probe the native sequence or a fragment thereof.

[0245] These derivatives are in particular molecules having a higher affinity for their binding sites, molecules exhibiting greater resistance to proteases, molecules having a higher therapeutic efficacy or fewer side effects, or optionally new biological properties. The derivatives also include the modified DNA sequences allowing improved expression in vivo.

[0246] In a first embodiment, the present invention relates to a defective recombinant virus comprising a cDNA sequence encoding an ABC1 protein involved in the transport and metabolism of cholesterol. In another preferred embodiment of the invention, the DNA sequence is a gDNA sequence.

[0247] The vectors of the invention may be prepared from various types of viruses. Preferably, vectors derived from adenoviruses, adeno-associated viruses (AAV), herpesviruses (HSV) or retroviruses are used. It is most particularly advantageous to use an adenovirus, for direct administration or for the ex vivo modification of cells intended to be implanted, or a retrovirus, for the implantation of producing cells.

[0248] The viruses according to the invention are defective, that is to say that they are incapable of autonomously replicating in the target cell. Generally, the genome of the defective viruses used in the context of the present invention therefore lacks at least the sequences necessary for the replication of said virus in the infected cell. These regions may be either eliminated (completely or partially), or made nonfunctional, or substituted with other sequences and in particular with the nucleic sequence encoding the ABC1 protein. Preferably, the defective virus retains, nevertheless, the sequences of its genome which are necessary for the encapsidation of the viral particles.

[0249] As regards more particularly adenoviruses, various serotypes, whose structure and properties vary somewhat, have been characterized. Among these serotypes, human adenoviruses of type 2 or 5 (Ad 2 or Ad 5) or adenoviruses of animal origin (see Application WO 94/26914) are preferably used in the context of the present invention. Among the adenoviruses of animal origin which can be used in the context of the present invention, there may be mentioned adenoviruses of canine, bovine, murine (example: Mav1, Beard et al., Virology 75 (1990) 81), ovine, porcine, avian or simian (example: SAV) origin. Preferably, the adenovirus of animal origin is a canine adenovirus, more preferably a CAV2 adenovirus [Manhattan or A26161 strain (ATCC VR-800) for example]. Preferably, adenoviruses of human or canine or mixed origin are used in the context of the invention. Preferably, the defective adenoviruses of the invention comprise the ITRs, a sequence allowing the encapsidation and the sequence encoding the ABC1 protein. Advantageously, in the genome of the adenoviruses of the invention, the E1 region at least is made nonfunctional. Still more preferably, in the genome of the adenoviruses of the invention, the E1 gene and at least one of the E2, E4 and L1-L5 genes are nonfunctional. The viral gene considered may be made nonfunctional by any technique known to a person skilled in the art, and in particular by total suppression, by substitution, by partial deletion or by addition of one or more bases in the gene(s) considered. Such modifications may be obtained in vitro (on the isolated DNA) or in situ, for example, by means of genetic engineering techniques, or by treatment by means of mutagenic agents. Other regions may also be modified, and in particular the E3 (WO95/02697), E2 (WO94/28938), E4 (WO94/28152, WO94/12649, WO95/02697) and L5 (WO95/02697) regions.
According to a preferred embodiment, the adenovirus according to the invention comprises a deletion in the E1 and E4 regions and the sequence encoding ABC1 is inserted at the level of the inactivated E1 region. According to another preferred embodiment, it comprises a deletion in the E1 region at the level of which the E4 region and the sequence encoding ABC1 (French Patent Application FR94 13355) are inserted.

[0250] The defective recombinant adenoviruses according to the invention may be prepared by any technique known to persons skilled in the art (Levrero et al., Gene 101 (1991) 195, EP 185 573; Graham, EMBO J. 3 (1984) 2917). In particular, they may be prepared by homologous recombination between an adenovirus and a plasmid carrying, inter alia, the DNA sequence encoding the ABC1 protein. The homologous recombination occurs after cotransfection of said adenovirus and plasmid into an appropriate cell line. The cell line used must preferably (i) be transformable by said elements, and (ii) contain the sequences capable of complementing the part of the defective adenovirus genome, preferably in integrated form in order to avoid the risks of recombination. By way of example of a line, there may be mentioned the human embryonic kidney line 293 (Graham et al., J. Gen. Virol. 36 (1977) 59) which contains in particular, integrated into its genome, the left part of the genome of an Ad5 adenovirus (12%) or lines capable of complementing the E1 and E4 functions as described in particular in Applications No. WO 94/26914 and WO95/02697.

[0251] Next, the adenoviruses which have multiplied are recovered and purified according to conventional molecular biological techniques, as illustrated in the examples.

[0252] As regards the adeno-associated viruses (AAV), they are DNA viruses of a relatively small size, which integrate into the genome of the cells which they infect, in a stable and site-specific manner. They are capable of infecting a broad spectrum of cells, without inducing any effect on cellular growth, morphology or differentiation. Moreover, they do not appear to be involved in pathologies in humans. The genome of AAVs has been cloned, sequenced and characterized. It comprises about 4700 bases, and contains at each end an inverted repeat region (ITR) of about 145 bases, serving as replication origin for the virus. The remainder of the genome is divided into 2 essential regions carrying the encapsidation functions: the left hand part of the genome, which contains the rep gene, involved in the viral replication and the expression of the viral genes; the right hand part of the genome, which contains the cap gene encoding the virus capsid proteins.

[0253] The use of vectors derived from AAVs for the transfer of genes in vitro and in vivo has been described in the literature (see in particular WO 91/18088; WO 93/09239; U.S. Pat. No. 4,797,368, U.S. Pat. No. 5,139,941, EP 488 528). These applications describe various constructs derived from AAVs, in which the rep and/or cap genes are deleted and replaced by a gene of interest, and their use for transferring in vitro (on cells in culture) or in vivo (directly into an organism) said gene of interest. However, none of these documents either describes or suggests the use of a recombinant AAV for the transfer and expression in vivo or ex vivo of an ABC1 protein, or the advantages of such a transfer. The defective recombinant AAVs according to the invention may be prepared by cotransfection, into a cell line infected with a human helper virus (for example an adenovirus), of a plasmid containing the sequence encoding the ABC1 protein bordered by two AAV inverted repeat regions (ITR), and of a plasmid carrying the AAV encapsidation genes (rep and cap genes). The recombinant AAVs produced are then purified by conventional techniques.

[0254] As regards the herpesviruses and the retroviruses, the construction of recombinant vectors has been widely described in the literature: see in particular Breakefield et al., New Biologist 3 (1991) 203; EP 453242, EP 178220, Bernstein et al. Genet. Eng. 7 (1985) 235; McCormick, BioTechnology 3 (1985) 689, and the like.

[0255] In particular, the retroviruses are integrating viruses, infecting dividing cells. The genome of the retroviruses essentially comprises two LTRs, an encapsidation sequence and three coding regions (gag, pol and env). In the recombinant vectors derived from retroviruses, the gag, pol and env genes are generally deleted, completely or partially, and replaced with a heterologous nucleic acid sequence of interest. These vectors may be produced from various types of retroviruses such as in particular MoMuLV (“Moloney murine leukemia virus”; also called MoMLV), MSV (“Moloney murine sarcoma virus”), HaSV (“Harvey sarcoma virus”), SNV (“spleen necrosis virus”), RSV (“Rous sarcoma virus”) or Friend's virus.

[0256] To construct recombinant retroviruses containing a sequence encoding the ABC1 protein according to the invention, a plasmid containing in particular the LTRs, the encapsidation sequence and said coding sequence is generally constructed, and then used to transfect a so-called encapsidation cell line, capable of providing in trans the retroviral functions deficient in the plasmid.

[0257] Generally, the encapsidation lines are therefore capable of expressing the gag, pol and env genes. Such encapsidation lines have been described in the prior art, and in particular the PA317 line (U.S. Pat. No. 4,861,719), the PsiCRIP line (WO 90/02806) and the GP+envAm-12 line (WO 89/07150). Moreover, the recombinant retroviruses may contain modifications at the level of the LTRs in order to suppress the transcriptional activity, as well as extended encapsidation sequences, containing a portion of the gag gene (Bender et al., J. Virol. 61 (1987) 1639). The recombinant retroviruses produced are then purified by conventional techniques.

[0258] To carry out the present invention, it is most particularly advantageous to use a defective recombinant adenovirus. The results given below indeed demonstrate the particularly advantageous properties of adenoviruses for the in vivo expression of a protein having a cholesterol transport activity. The adenoviral vectors according to the invention are particularly advantageous for a direct administration in vivo of a purified suspension, or for the ex vivo transformation of cells, in particular autologous cells, in view of their implantation. Furthermore, the adenoviral vectors according to the invention exhibit, in addition, considerable advantages, such as in particular their very high infection efficiency, which makes it possible to carry out infections using small volumes of viral suspension.

[0259] According to another particularly advantageous embodiment of the invention, a line producing retroviral vectors containing the sequence encoding the ABC1 protein is used for implantation in vivo. The lines which can be used to this end are in particular the PA317 (U.S. Pat. No. 4,861,719), PsiCRIP (WO 90/02806) and GP+envAm-12 (U.S. Pat. No. 5,278,056) cells modified so as to allow the production of a retrovirus containing a nucleic sequence encoding an ABC1 protein according to the invention. For example, totipotent stem cells, precursors of blood cell lines, may be collected and isolated from a subject. These cells, when cultured, may then be transfected with the retroviral vector containing the sequence encoding the ABC1 protein under the control of viral or nonviral promoters, including nonviral promoters specific for macrophages, or under the control of its own promoter. These cells are then reintroduced into the subject. The differentiation of these cells will give rise to blood cells expressing the ABC1 protein, in particular monocytes which, when transformed to macrophages, participate in the removal of cholesterol from the arterial wall. These macrophages expressing the ABC1 protein will have an increased capacity to metabolize excess cholesterol and will make it available at the cell surface for its removal by the primary acceptors of membrane cholesterol.

[0260] Advantageously, in the vectors of the invention, the sequence encoding the ABC1 protein is placed under the control of signals allowing its expression in the infected cells. These may be expression signals which are homologous or heterologous, that is to say signals different from those which are naturally responsible for the expression of the ABC1 protein. They may also be in particular sequences responsible for the expression of other proteins, or synthetic sequences. In particular, they may be sequences of eukaryotic or viral genes or derived sequences, stimulating or repressing the transcription of a gene in a specific manner or otherwise and in an inducible manner or otherwise. By way of example, they may be promoter sequences derived from the genome of the cell which it is desired to infect, or from the genome of a virus, and in particular the promoters of the E1A or MLP genes of adenoviruses, the CMV promoter, the RSV-LTR and the like. Among the eukaryotic promoters, there may also be mentioned the ubiquitous promoters (HPRT, vimentin, α-actin, tubulin and the like), the promoters of the intermediate filaments (desmin, neurofilaments, keratin, GFAP, and the like), the promoters of therapeutic genes (of the MDR, CFTR or factor VIII type, and the like), tissue-specific promoters (pyruvate kinase, villin, promoter of the fatty acid binding intestinal protein, promoter of the smooth muscle cell α-actin, promoters specific for the liver: Apo AI, Apo AII, human albumin and the like) or promoters corresponding to a stimulus (steroid hormone receptor, retinoic acid receptor and the like). In addition, these expression sequences may be modified by addition of enhancer or regulatory sequences and the like. Moreover, when the inserted gene does not contain expression sequences, it may be inserted into the genome of the defective virus downstream of such a sequence.

[0261] In a specific embodiment, the invention relates to a defective recombinant virus comprising a nucleic sequence encoding an ABC1 protein involved in the metabolism of cholesterol under the control of a promoter chosen from RSV-LTR or the CMV early promoter.

[0262] As indicated above, the present invention also relates to any use of a virus as described above for the preparation of a pharmaceutical composition for the treatment and/or prevention of pathologies linked to the transport of cholesterol.

[0263] The present invention also relates to a pharmaceutical composition comprising one or more defective recombinant viruses as described above. These pharmaceutical compositions may be formulated for administration by the topical, oral, parenteral, intranasal, intravenous, intramuscular, subcutaneous, intraocular or transdermal route and the like. Preferably, the pharmaceutical compositions of the invention contain a pharmaceutically acceptable vehicle for an injectable formulation, in particular for an intravenous injection, such as for example into the patient's portal vein. They may be in particular isotonic sterile solutions or dry, in particular freeze-dried, compositions which, upon addition of sterilized water or physiological saline, as the case may be, allow the preparation of injectable solutions. Direct injection into the patient's portal vein is advantageous because it makes it possible to target the infection at the level of the liver and thus to concentrate the therapeutic effect at the level of this organ.

[0264] The doses of defective recombinant virus used for the injection may be adjusted as a function of various parameters, and in particular as a function of the viral vector, of the mode of administration used, of the relevant pathology or of the desired duration of treatment. In general, the recombinant adenoviruses according to the invention are formulated and administered in the form of doses of between 10⁴ and 10¹⁴ pfu/ml, and preferably 10⁶ to 10¹⁰ pfu/ml. The term pfu (“plaque forming unit”) corresponds to the infectivity of a virus solution, and is determined by infecting an appropriate cell culture and measuring, generally after 48 hours, the number of plaques of infected cells. The techniques for determining the pfu titer of a viral solution are well documented in the literature.
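By way of illustration only, the determination of a pfu titer from a plaque count may be sketched as follows. This is a minimal sketch and not a protocol of the invention: it assumes the usual plaque-assay relationship (titer = plaques counted / (dilution × volume plated)), and the function names and the numerical values are hypothetical. The range check uses the 10⁴ to 10¹⁴ pfu/ml and 10⁶ to 10¹⁰ pfu/ml intervals stated above.

```python
def pfu_titer(plaque_count: int, dilution: float, plated_volume_ml: float) -> float:
    """Return the titer of the undiluted stock in pfu/ml.

    dilution is the fraction of stock in the plated sample (e.g. 1e-6 for a
    million-fold dilution); plated_volume_ml is the volume applied to the cells.
    """
    if plaque_count < 0:
        raise ValueError("plaque_count must be non-negative")
    if not 0 < dilution <= 1:
        raise ValueError("dilution must be in the interval (0, 1]")
    if plated_volume_ml <= 0:
        raise ValueError("plated_volume_ml must be positive")
    return plaque_count / (dilution * plated_volume_ml)


def within_range(titer: float, low: float, high: float) -> bool:
    """Check whether a titer lies in the closed interval [low, high] pfu/ml."""
    return low <= titer <= high


# Hypothetical assay: 42 plaques from 0.1 ml of a 1e-6 dilution.
titer = pfu_titer(42, 1e-6, 0.1)
general_ok = within_range(titer, 1e4, 1e14)    # general range stated above
preferred_ok = within_range(titer, 1e6, 1e10)  # preferred range stated above
```

In this hypothetical case the stock titer is about 4.2 × 10⁸ pfu/ml, which lies within both intervals.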

[0265] As regards retroviruses, the compositions according to the invention may directly contain the producing cells, with a view to their implantation.

[0266] In this regard, another subject of the invention relates to any mammalian cell infected with one or more defective recombinant viruses as described above. More particularly, the invention relates to any population of human cells infected with these viruses. These may be in particular cells of blood origin (totipotent stem cells or precursors), fibroblasts, myoblasts, hepatocytes, keratinocytes, smooth muscle and endothelial cells, glial cells and the like.

[0267] The cells according to the invention may be derived from primary cultures. These may be collected by any technique known to persons skilled in the art and then cultured under conditions allowing their proliferation. As regards more particularly fibroblasts, these may be easily obtained from biopsies, for example according to the technique described by Ham [Methods Cell. Biol. 21a (1980) 255]. These cells may be used directly for infection with the viruses, or stored, for example by freezing, for the establishment of autologous libraries, in view of a subsequent use. The cells according to the invention may be secondary cultures, obtained for example from preestablished libraries (see for example EP 228458, EP 289034, EP 400047, EP 456640).

[0268] The cells in culture are then infected with the recombinant viruses, in order to confer on them the capacity to produce a biologically active ABC1 protein. The infection is carried out in vitro according to techniques known to persons skilled in the art. In particular, depending on the type of cells used and the desired number of copies of virus per cell, persons skilled in the art can adjust the multiplicity of infection and optionally the number of infectious cycles produced. It is clearly understood that these steps must be carried out under appropriate conditions of sterility when the cells are intended for administration in vivo. The doses of recombinant virus used for the infection of the cells may be adjusted by persons skilled in the art according to the desired aim. The conditions described above for the administration in vivo may be applied to the infection in vitro. For the infection with retroviruses, it is also possible to coculture the cells which it is desired to infect with cells producing the recombinant retroviruses according to the invention. This makes it possible to dispense with the purification of the retroviruses.
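The adjustment of the multiplicity of infection mentioned above may be illustrated by the following sketch. It is given as an assumption-laden example only: the multiplicity of infection (MOI) is taken as the number of pfu added per cell, and the fraction of cells receiving at least one particle is estimated with a Poisson model, which is a common approximation and is not taken from the present description. All names and values are hypothetical.

```python
import math


def virus_volume_ml(cell_count: float, moi: float, titer_pfu_per_ml: float) -> float:
    """Volume of viral stock (ml) needed to infect cell_count cells at a given MOI."""
    if cell_count <= 0 or moi < 0 or titer_pfu_per_ml <= 0:
        raise ValueError("cell_count and titer must be positive, moi non-negative")
    return cell_count * moi / titer_pfu_per_ml


def fraction_infected(moi: float) -> float:
    """Expected fraction of cells hit by at least one particle (Poisson assumption)."""
    if moi < 0:
        raise ValueError("moi must be non-negative")
    return 1.0 - math.exp(-moi)


# Hypothetical example: 1e6 cells, MOI of 10, stock at 1e9 pfu/ml.
volume = virus_volume_ml(1e6, 10, 1e9)  # about 0.01 ml of stock
coverage = fraction_infected(10)        # close to 1 under the Poisson model
```

Under this model, raising the MOI increases both the expected fraction of infected cells and the mean number of viral copies per cell, which is why the two are adjusted together.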

[0269] Another subject of the invention relates to an implant comprising mammalian cells infected with one or more defective recombinant viruses as described above or cells producing recombinant viruses, and an extracellular matrix. Preferably, the implants according to the invention comprise 10⁵ to 10¹⁰ cells. More preferably, they comprise 10⁶ to 10⁸ cells.

[0270] More particularly, in the implants of the invention, the extracellular matrix comprises a gelling compound and optionally a support allowing the anchorage of the cells.

[0271] For the preparation of the implants according to the invention, various types of gelling agents may be used. The gelling agents are used for the inclusion of the cells in a matrix having the constitution of a gel, and for promoting the anchorage of the cells on the support, where appropriate. Various cell adhesion agents can therefore be used as gelling agents, such as in particular collagen, gelatin, glycosaminoglycans, fibronectin, lectins and the like. Preferably, collagen is used in the context of the present invention. This may be collagen of human, bovine or murine origin. More preferably, type I collagen is used.

[0272] As indicated above, the compositions according to the invention advantageously comprise a support allowing the anchorage of the cells. The term anchorage designates any form of biological and/or chemical and/or physical interaction causing the adhesion and/or the attachment of the cells to the support. Moreover, the cells may either cover the support used, or penetrate inside this support, or both. It is preferable to use in the context of the invention a solid, nontoxic and/or biocompatible support. In particular, it is possible to use polytetrafluoroethylene (PTFE) fibers or a support of biological origin.

[0273] The present invention thus offers a very effective means for the treatment or prevention of pathologies linked to the transport of cholesterol, in particular obesity, hypertriglyceridemia, or, in the field of cardiovascular conditions, myocardial infarction, angina, sudden death, cardiac decompensation and cerebrovascular accidents.

[0274] In addition, this treatment may be applied to both humans and any animals such as ovines, bovines, domestic animals (dogs, cats and the like), horses, fish and the like.

[0275] Recombinant Host Cells

[0276] The invention also relates to a recombinant host cell comprising any of the nucleic acids of the invention, and more particularly a nucleic acid having the sequence SEQ ID NO 91, 94 or 96.

[0277] According to another aspect, the invention also relates to a recombinant host cell comprising a recombinant vector as described above.

[0278] The preferred host cells according to the invention are for example the following:

[0279] a) prokaryotic host cells: strains of Escherichia coli (strain DH5-α), of Bacillus subtilis, of Salmonella typhimurium, or strains of species such as Pseudomonas, Streptomyces and Staphylococcus;

[0280] b) eukaryotic host cells: HeLa cells (ATCC No. CCL2), CV-1 cells (ATCC No. CCL70), COS cells (ATCC No. CRL 1650), Sf-9 cells (ATCC No. CRL 1711), CHO cells (ATCC No. CCL-61) or 3T3 cells (ATCC No. CRL-6361).

[0281] Mutated ABC1 Polypeptides

[0282] According to another aspect, the invention relates to a polypeptide encoded by a mutated ABC1 gene, and more particularly a mutated ABC1 gene in patients suffering from a deficiency in the reverse transport of cholesterol, most particularly in patients suffering from Tangier disease.

[0283] As indicated above, two deleterious mutations have been identified in patients suffering from Tangier disease.

[0284] The first mutation corresponds to the insertion of a fragment of about one hundred base pairs into the coding sequence, at the level of exon 12 of the ABC1 gene, leading to the production of a biologically inactive polypeptide of 2233 amino acids having the sequence SEQ ID NO 140. The mutated ABC1 polypeptide having the sequence SEQ ID NO 140 possesses, compared with the normal polypeptide having the sequence SEQ ID NO 139, the following differences:

[0285] a) a deletion of a peptide fragment having the sequence “DERKFW” and the replacement of this peptide fragment with the sequence “EYSGVTSAHCNLCLLSSSDSRASASQVAGITAPATTPG” encoded by the inserted Alu-type nucleotide fragment.

[0286] The second mutation relates to the introduction of an early stop codon into the first quarter of the coding sequence, at the level of exon 13 of the ABC1 gene, leading to the production of a truncated polypeptide having 574 amino acids having the sequence SEQ ID NO 141. In addition, the deletion of the G base induces a change in the reading frame leading to a protein whose COOH-terminal end is not found in the amino acid sequence of the normal ABC1 polypeptide. This is the COOH-terminal sequence “RAPRRKLVSICNRCPIPVTLMTSFCG” of the mutated ABC1 polypeptide having the sequence SEQ ID NO 141.
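The mechanism described above, in which the deletion of a single base shifts the reading frame, alters the downstream amino acids and brings an early stop codon into frame, may be illustrated by the following sketch. The short DNA sequence used is purely hypothetical and is not taken from the ABC1 gene or from any SEQ ID of the invention; only the standard genetic code is assumed.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate from the first base, stopping at the first in-frame stop codon."""
    protein = []
    for i in range(0, len(dna) - 2, 3):
        aa = CODON_TABLE[dna[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


def delete_base(dna: str, index: int) -> str:
    """Return the sequence with the base at the given 0-based index removed."""
    return dna[:index] + dna[index + 1:]


# Hypothetical toy coding sequence (NOT an ABC1 sequence).
wild_type = "ATGGCGTTAACCGGTTGA"
mutant = delete_base(wild_type, 3)  # delete the G that starts the second codon

wild_type_protein = translate(wild_type)  # "MALTG"
mutant_protein = translate(mutant)        # "MR": shifted frame, early stop
```

In this toy case the deletion changes the second residue and exposes a stop codon at the third position, so the product is both altered at its COOH-terminal end and truncated, which is the same type of effect as that described for the polypeptide having the sequence SEQ ID NO 141.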

[0287] These two polypeptides are useful in particular for the preparation of antibodies specifically recognizing them. Such antibodies constitute means of detection of the production of these mutated ABC1 polypeptides in a sample obtained from a subject to be tested, preferably a patient having symptoms characteristic of a deficiency in the reverse transport of cholesterol, and most preferably in a patient having the symptoms characteristic of Tangier disease.

[0288] According to another aspect, the invention therefore relates to a polypeptide comprising an amino acid sequence SEQ ID NO 140.

[0289] According to another aspect, the invention relates to a polypeptide comprising an amino acid sequence SEQ ID NO 141.

[0290] The invention also relates to a polypeptide comprising an amino acid sequence having at least 80% amino acid identity with an amino acid sequence chosen from the group consisting of the peptides having the sequences SEQ ID NO 140 and 141, or a peptide fragment thereof.

[0291] A first preferred peptide fragment will comprise at least 5 consecutive amino acids of the peptide fragment having the sequence “EYSGVTSAHCNLCLLSSSDSRASASQVAGITAPATTPG” contained in the mutated ABC1 polypeptide having the sequence SEQ ID NO 140.

[0292] A second preferred peptide fragment will comprise at least 5 consecutive amino acids of the peptide fragment having the sequence “RAPRRKLVSICNRCPIPVTLMTSFCG” contained in the mutated ABC1 polypeptide having the sequence SEQ ID NO 141.
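The "at least 5 consecutive amino acids" criterion of the two preferred fragments can be illustrated with a minimal sketch. The two peptide strings are those quoted in the text; the function names and the default window length are illustrative assumptions, not part of the specification.

```python
# Junction peptides quoted in the text for SEQ ID NO 140 and SEQ ID NO 141.
ALU_INSERT_PEPTIDE = "EYSGVTSAHCNLCLLSSSDSRASASQVAGITAPATTPG"
FRAMESHIFT_TAIL_PEPTIDE = "RAPRRKLVSICNRCPIPVTLMTSFCG"


def consecutive_fragments(peptide, length=5):
    """Return every window of `length` consecutive residues in `peptide`."""
    if length < 1 or length > len(peptide):
        return []
    return [peptide[i:i + length] for i in range(len(peptide) - length + 1)]


def contains_junction_fragment(candidate, peptide, length=5):
    """True if `candidate` contains at least `length` consecutive residues of `peptide`."""
    return any(frag in candidate for frag in consecutive_fragments(peptide, length))


if __name__ == "__main__":
    frags = consecutive_fragments(ALU_INSERT_PEPTIDE)
    print(len(frags), "five-residue windows, first:", frags[0])
```

A peptide (hypothetical here) that contains any one of these windows would satisfy the minimum-length criterion; longer windows can be tested by changing `length`.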

[0293] Advantageously, a polypeptide having at least 85%, 90%, 95% or 99% amino acid identity with an amino acid sequence chosen from the group consisting of the peptides having the sequences SEQ ID NO 140 and 141, or a peptide fragment thereof, forms part of the invention.
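As an illustration of the identity thresholds (80%, 85%, 90%, 95%, 99%), the following sketch computes percent identity over two sequences that have already been aligned to equal length, with "-" marking gaps. Counting identical columns over all aligned columns is an assumption made here for simplicity; the specification may rely on a different definition or on a dedicated alignment program.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent of aligned columns holding the same residue in both sequences.

    Both inputs must already be aligned (same length, '-' for gaps).
    Columns where both sequences have a gap are ignored.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be aligned to the same length")
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b)
               if not (a == "-" and b == "-")]
    if not columns:
        return 0.0
    matches = sum(1 for a, b in columns if a == b)
    return 100.0 * matches / len(columns)


def meets_identity_threshold(aligned_a, aligned_b, threshold=80.0):
    """True if the aligned pair reaches the given percent identity."""
    return percent_identity(aligned_a, aligned_b) >= threshold


if __name__ == "__main__":
    # First ten residues of the frameshift tail versus a one-residue variant.
    print(percent_identity("RAPRRKLVSI", "RAPRRKLVSA"))
```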

[0294] Preferably, polypeptides according to the invention will have a length of 15, 18 or 20 to 25, 35, 40, 50, 70, 80, 100 or 200 consecutive amino acids of a polypeptide according to the invention, in particular a polypeptide having an amino acid sequence chosen from the sequences SEQ ID NO 140 and 141.

[0295] Alternatively, a polypeptide according to the invention will consist of and/or will comprise the fragments having a length of 15, 18, 20, 25, 35, 40, 50, 100 or 200 consecutive amino acids of a polypeptide according to the invention, more particularly of a polypeptide chosen from the sequences SEQ ID NO 140 and 141.

[0296] Generally, the polypeptides according to the invention are provided in an isolated or purified form.

[0297] The invention also relates to a method for the production of one of the polypeptides having the sequences SEQ ID NO 140 and 141, or of a peptide fragment or of a variant thereof, said method comprising the steps of:

[0298] a) inserting a nucleic acid encoding said polypeptide into an appropriate vector;

[0299] b) culturing, in an appropriate culture medium, a host cell previously transformed or transfected with the recombinant vector of step a);

[0300] c) recovering the conditioned culture medium or lysing the host cell, for example by sonication or by osmotic shock;

[0301] d) separating and purifying said polypeptide from said culture medium or alternatively from the cell lysates obtained in step c);

[0302] e) where appropriate, characterizing the recombinant polypeptide produced.

[0303] The peptides according to the invention may be characterized by binding to an immunoaffinity chromatography column on which the antibodies directed against this polypeptide or against a fragment or a variant thereof have been previously immobilized.

[0304] According to another aspect, a recombinant polypeptide according to the invention may be purified by passing over an appropriate series of chromatography columns, according to methods known to persons skilled in the art and described for example in F. Ausubel et al (1989).

[0305] A polypeptide according to the invention may also be prepared by conventional chemical synthesis techniques either in homogeneous solution or in solid phase.

[0306] By way of illustration, a polypeptide according to the invention may be prepared either by the homogeneous solution technique described by Houben Weyl (1974) or by the solid phase synthesis technique described by Merrifield (1965a; 1965b).

[0307] Polypeptides termed “homologous” to any one of the polypeptides having the amino acid sequences SEQ ID NO 140 and 141, or their fragments or variants, also form part of the invention.

[0308] Such homologous polypeptides have amino acid sequences possessing one or more substitutions of an amino acid by an equivalent amino acid, relative to the reference polypeptides.

[0309] "Equivalent amino acid" according to the present invention will be understood to mean, for example, the replacement of a residue in the L form by a residue in the D form, or the replacement of a glutamic acid (E) by a pyro-glutamic acid, according to techniques well known to persons skilled in the art. By way of illustration, the synthesis of peptides containing at least one residue in the D form is described by Koch (1977).

[0310] According to another aspect, two amino acids belonging to the same class, that is to say two uncharged polar, nonpolar, basic or acidic amino acids, are also considered as equivalent amino acids.
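The class-based notion of equivalence can be sketched as follows. The text names the four classes but does not list their members; the assignment of the twenty standard residues below follows a common textbook grouping and is an assumption, as are the function names.

```python
# Assumed class membership (common textbook grouping; not given in the text).
AMINO_ACID_CLASSES = {
    "nonpolar": set("AVLIPFWM"),
    "uncharged polar": set("GSTCYNQ"),
    "basic": set("KRH"),
    "acidic": set("DE"),
}


def residue_class(residue):
    """Return the class name of a one-letter residue code, or None if unknown."""
    for name, members in AMINO_ACID_CLASSES.items():
        if residue in members:
            return name
    return None


def is_equivalent(res_a, res_b):
    """True if both residues are known and belong to the same class."""
    cls = residue_class(res_a)
    return cls is not None and cls == residue_class(res_b)


def differs_only_by_equivalent_residues(reference, variant):
    """True if the variant has the same length as the reference and every
    position holds either the same residue or one of the same class."""
    if len(reference) != len(variant):
        return False
    return all(is_equivalent(a, b) for a, b in zip(reference, variant))


if __name__ == "__main__":
    print(is_equivalent("K", "R"))                                 # basic/basic
    print(differs_only_by_equivalent_residues("DERKFW", "EERKFW"))  # D -> E
```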

[0311] Polypeptides comprising at least one nonpeptide bond such as a retro-inverse bond (NHCO), a carba bond (CH₂CH₂) or a ketomethylene bond (CO—CH₂) also form part of the invention.

[0312] Preferably, the polypeptides according to the invention comprising one or more additions, deletions, substitutions of at least one amino acid will retain their capacity to be recognized by antibodies directed against the nonmodified polypeptides.

[0313] Antibodies

[0314] The mutated ABC1 polypeptides according to the invention, in particular the polypeptides having the amino acid sequences SEQ ID NO 140 and 141, or the fragments thereof, as well as the homologous peptides, may be used for the preparation of antibodies, in particular for detecting the production of altered forms of the ABC1 polypeptide in a patient.

[0315] A first preferred antibody according to the invention is directed against a peptide fragment comprising at least 5 consecutive amino acids of the peptide fragment having the sequence “EYSGVTSAHCNLCLLSSSDSRASASQVAGITAPATTPG” contained in the mutated ABC1 polypeptide having the sequence SEQ ID NO 140.

[0316] A second preferred antibody according to the invention is directed against a peptide fragment comprising at least 5 consecutive amino acids of the peptide fragment having the sequence “RAPRRKLVSICNRCPIPVTLMTSFCG” contained in the mutated ABC1 polypeptide having the sequence SEQ ID NO 141.

[0317] “Antibody” for the purposes of the present invention will be understood to mean in particular polyclonal or monoclonal antibodies or fragments (for example the F(ab′)₂ and Fab fragments) or any polypeptide comprising a domain of the initial antibody recognizing the target polypeptide or polypeptide fragment according to the invention.

[0318] Monoclonal antibodies may be prepared from hybridomas according to the technique described by Kohler and Milstein (1975).

[0319] The present invention also relates to antibodies directed against a polypeptide as described above or a fragment or a variant thereof, as produced in the trioma technique or the hybridoma technique described by Kozbor et al. (1983).

[0320] The invention also relates to single-chain Fv antibody fragments (ScFv) as described in U.S. Pat. No. 4,946,778 or by Martineau et al. (1998).

[0321] The antibodies according to the invention also comprise antibody fragments obtained with the aid of phage libraries (Ridder et al., 1995) or humanized antibodies (Reinmann et al., 1997; Leger et al., 1997).

[0322] The antibody preparations according to the invention are useful in immunological detection tests intended for the identification of the presence and/or of the quantity of antigens present in a sample.

[0323] An antibody according to the invention may comprise, in addition, a detectable marker which is isotopic or nonisotopic, for example fluorescent, or may be coupled to a molecule such as biotin, according to techniques well known to persons skilled in the art.

[0324] Thus, the subject of the invention is, in addition, a method of detecting the presence of a polypeptide in accordance with the invention in a sample, said method comprising the steps of:

[0325] a) bringing the sample to be tested into contact with an antibody as described above;

[0326] b) detecting the antigen/antibody complex formed.

[0327] The invention also relates to a box or kit for diagnosis or for detecting the presence of a polypeptide in accordance with the invention in a sample, said box comprising:

[0328] a) an antibody as defined above;

[0329] b) a reagent allowing the detection of the antigen/antibody complexes formed.

[0330] Pharmaceutical Compositions and Therapeutic Methods of Treatment

[0331] The invention also relates to pharmaceutical compositions intended for the prevention or treatment of a deficiency in the metabolism of cholesterol such as atherosclerosis, particularly in the transport of cholesterol, and still more particularly in the reverse transport of cholesterol, characterized in that they comprise a therapeutically effective quantity of a polynucleotide capable of giving rise to the production of an effective quantity of the normal ABC1 polypeptide, in particular of the polypeptide having the sequence SEQ ID NO 139.

[0332] The subject of the invention is, in addition, pharmaceutical compositions intended for the prevention or treatment of a deficiency in the metabolism of cholesterol such as atherosclerosis, particularly in the transport of cholesterol, and still more particularly in the reverse transport of cholesterol, characterized in that they comprise a therapeutically effective quantity of the normal ABC1 polypeptide, in particular of the polypeptide having the sequence SEQ ID NO 139.

[0333] Such pharmaceutical compositions will be advantageously suitable for the administration, for example by the parenteral route, of a quantity of the ABC1 polypeptide ranging from 1 μg/kg/day to 10 mg/kg/day, preferably at least 0.01 mg/kg/day and most preferably between 0.01 and 1 mg/kg/day.
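The dose ranges above are expressed per kilogram of body weight per day. Purely as an arithmetic illustration, the sketch below converts a per-kilogram dose into a total daily quantity and reports which of the stated ranges it falls in. The body weight, the function names and the tier labels are illustrative assumptions; nothing here is a dosing recommendation.

```python
# Ranges stated in the text, converted to mg/kg/day.
MIN_DOSE = 0.001           # 1 microgram/kg/day
MAX_DOSE = 10.0            # 10 mg/kg/day
PREFERRED_MIN = 0.01       # "preferably at least 0.01 mg/kg/day"
MOST_PREFERRED_MAX = 1.0   # "most preferably between 0.01 and 1 mg/kg/day"


def daily_dose_mg(body_weight_kg, dose_mg_per_kg_day):
    """Total quantity of polypeptide per day, in mg."""
    return body_weight_kg * dose_mg_per_kg_day


def dose_tier(dose_mg_per_kg_day):
    """Name the narrowest stated range that contains the dose."""
    if dose_mg_per_kg_day < MIN_DOSE or dose_mg_per_kg_day > MAX_DOSE:
        return "outside stated range"
    if PREFERRED_MIN <= dose_mg_per_kg_day <= MOST_PREFERRED_MAX:
        return "most preferred"
    if dose_mg_per_kg_day >= PREFERRED_MIN:
        return "preferred"
    return "within stated range"


if __name__ == "__main__":
    # Hypothetical 70 kg subject at the lower bound of the most preferred range.
    print(daily_dose_mg(70, 0.01), "mg/day,", dose_tier(0.01))
```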

[0334] The pharmaceutical compositions according to the invention may be equally well administered by the oral, rectal, parenteral, intravenous, subcutaneous or intradermal route.

[0335] The invention also relates to the use of the ABC1 polypeptide having the sequence SEQ ID NO 139 for the manufacture of a medicament intended for the prevention of atherosclerosis in various forms or more particularly for the treatment of subjects affected by a dysfunction in the reverse transport of cholesterol.

[0336] The invention finally relates to a pharmaceutical composition for the prevention or treatment of subjects affected by a dysfunction in the reverse transport of cholesterol, comprising a therapeutically effective quantity of the polypeptide having the sequence SEQ ID NO 139.

[0337] According to another aspect, the subject of the invention is also a preventive or curative therapeutic method of treating diseases caused by a deficiency in the metabolism of cholesterol, more particularly in the transport of cholesterol and still more particularly in the reverse transport of cholesterol, such a method comprising a step in which there is administered to a patient a polynucleotide capable of giving rise to the expression of the ABC1 polypeptide in said patient, said polynucleotide being, where appropriate, combined with one or more physiologically compatible vehicles and/or excipients.

[0338] Preferably, a pharmaceutical composition comprising a polynucleotide, as defined above, will be administered to the patient.

[0339] According to yet another aspect, the subject of the invention is also a preventive or curative therapeutic method of treating diseases caused by a deficiency in the metabolism of cholesterol, more particularly in the transport of cholesterol and still more particularly in the reverse transport of cholesterol, such a method comprising a step in which there is administered to a patient a therapeutically effective quantity of the ABC1 polypeptide in said patient, said polypeptide being, where appropriate, combined with one or more physiologically compatible vehicles and/or excipients.

[0340] Preferably, a pharmaceutical composition comprising a polypeptide, as defined above, will be administered to the patient.

[0341] Methods of Screening an Agonist or Antagonist Compound for the ABC1 Polypeptide

[0342] According to another aspect, the invention also relates to various methods of screening compounds for therapeutic use which are useful in the treatment of diseases due to a deficiency in the metabolism of cholesterol, particularly in the transport of cholesterol, still more particularly in the reverse transport of cholesterol, such as Tangier disease, or more generally FHD-type conditions.

[0343] The invention therefore also relates to the use of the ABC1 polypeptide, or of cells expressing the ABC1 polypeptide, for screening active ingredients for the prevention or treatment of diseases resulting from a dysfunction in the reverse transport of cholesterol.

[0344] The catalytic sites and oligopeptide or immunogenic fragments of the ABC1 polypeptide can serve for screening product libraries by a whole range of existing techniques. The fragment used in this type of screening may be free in solution, bound to a solid support, at the cell surface or in the cell. The formation of the binding complexes between the ABC1 fragments and the tested agent can then be measured.

[0345] Another product screening technique which may be used in high-throughput screening, giving access to products having affinity for the protein of interest, is described in application WO84/03564. In this method, applied to the ABC1 protein, various products are synthesized on a solid surface. These products react with the ABC1 protein or fragments thereof and the complex is washed. The products binding the ABC1 protein are then detected by methods known to persons skilled in the art. Nonneutralizing antibodies can also be used to capture a peptide and immobilize it on a support.

[0346] Another possibility is a competitive screening in which ABC1-neutralizing antibodies compete with a product potentially binding the ABC1 protein for binding to that protein. In this manner, the antibodies may be used to detect the presence of a peptide having antigenic units in common with ABC1.

[0347] Among the products to be evaluated which may make it possible to increase the ABC1 activity, there may be mentioned in particular the kinase-specific ATP homologs involved in the activation of the molecules as well as phosphatases which may be able to avoid the dephosphorylation resulting from said kinases. There may also be mentioned phosphodiesterase (PDE) inhibitors of the theophylline and 3-isobutyl-1-methylxanthine type, or adenylate cyclase activators such as forskolin.

[0348] Accordingly, we claim in this invention the use of any method of screening products based on the method of translocation of cholesterol (see Example 17) between the membranes or vesicles, this being in all synthetic or cellular types, that is to say of mammals, insects, bacteria or yeasts expressing constitutively or having incorporated the human ABC1 sequence. To this effect, labeled lipid analogs may be used.

[0349] Likewise, it has been described that the ABC1 protein allows anion transport (Becq et al., Journal of Biological Chemistry vol 272, No. 5, pages 2695-2699, 1997 and Yamon et al., Blood vol 90, No. 8, pages 2911-2915, 1997) and that this transport is activated by phosphatase inhibitors such as okadaic acid and orthovanadate, as well as by the elevation of cAMP induced by agents such as forskolin. We claim the use of this system for screening molecules modulating the activity of the ABC1 protein (see Example 18).

[0350] Yamon et al. (Blood vol 90, No. 8, pages 2911-2915, 1997) have demonstrated that the mouse ABC1 protein is involved in the secretion of the proinflammatory cytokine IL-1 beta in mouse peritoneal macrophages. It is therefore also possible to provide a method of screening products modulating the activity of the ABC1 protein by determining the release of IL-1 beta from any cell type expressing these two proteins (see Example 19).

[0351] Furthermore, knowing that the disruption of numerous transporters has been described (van den Hazel, H., H. Pichler, V. M. M. do, E. Leitner, A. Goffeau, and G. Daum. 1999. PDR16 and PDR17, two homologous genes of Saccharomyces cerevisiae, affect lipid biosynthesis and resistance to multiple drugs. J. Biol. Chem. 274 (4):1934-1941), it is possible to envisage using cellular mutants having a characteristic phenotype, complementing the function thereof with ABC1, and using the whole for screening purposes.

[0352] The invention also relates to a method of screening a compound active on the metabolism of cholesterol, an agonist or antagonist of the ABC1 polypeptide, said method comprising the following steps:

[0353] a) preparing membrane vesicles containing the ABC1 polypeptide and a lipid substrate comprising a detectable marker;

[0354] b) incubating the vesicles obtained in step a) with an agonist or antagonist candidate compound;

[0355] c) qualitatively and/or quantitatively measuring the release of the lipid substrate comprising a detectable marker;

[0356] d) comparing the measurement obtained in step c) with a measurement of the release of the labeled lipid substrate by vesicles which have not been previously incubated with the agonist or antagonist candidate compound.
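The comparison in the final step of this method can be sketched as a relative change between the release measured with the candidate compound and the release measured with untreated vesicles. The 20% decision threshold and the labels returned are illustrative assumptions; the text does not specify a cut-off.

```python
def relative_change(treated_release, control_release):
    """Fractional change in labeled-lipid release versus untreated vesicles."""
    if control_release <= 0:
        raise ValueError("control release must be positive")
    return (treated_release - control_release) / control_release


def classify_candidate(treated_release, control_release, threshold=0.2):
    """Label a candidate from the direction and size of the change.

    `threshold` is a hypothetical cut-off (20% by default).
    """
    change = relative_change(treated_release, control_release)
    if change >= threshold:
        return "agonist-like"
    if change <= -threshold:
        return "antagonist-like"
    return "no clear effect"


if __name__ == "__main__":
    # Hypothetical fluorescence or radioactivity readings (arbitrary units).
    print(classify_candidate(150.0, 100.0))
    print(classify_candidate(60.0, 100.0))
```

In practice replicate measurements and a statistical test would replace the fixed threshold used in this sketch.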

[0357] According to a first aspect of the above screening method, the membrane vesicles are synthetic lipid vesicles, which may be prepared according to techniques well known to persons skilled in the art. According to this particular aspect, the ABC1 protein may be a recombinant ABC1 protein.

[0358] According to a second aspect, the membrane vesicles are vesicles of plasma membranes derived from cells expressing the ABC1 polypeptide. These may be cells naturally expressing the ABC1 polypeptide or cells transfected with a recombinant vector encoding the ABC1 polypeptide.

[0359] According to a third aspect of the above screening method, the lipid substrate is chosen from cholesterol or phosphatidylcholine.

[0360] According to a fourth aspect, the lipid substrate is radioactively labeled, for example with an isotope chosen from ³H or ¹²⁵I.

[0361] According to a fifth aspect, the lipid substrate is labeled with a fluorescent compound, such as NBD or pyrene.

[0362] According to a sixth aspect, the membrane vesicles comprising the labeled lipid substrate and the ABC1 polypeptide are immobilized at the surface of a solid support prior to step b).

[0363] According to a seventh aspect, the measurement of the fluorescence or of the radioactivity released by the vesicles is the direct reflection of the activity of lipid substrate transport by the ABC1 polypeptide.

[0364] The invention also relates to a method of screening a compound active on the metabolism of cholesterol, an agonist or antagonist of the ABC1 polypeptide, said method comprising the following steps:

[0365] a) obtaining cells, for example a cell line, expressing naturally or after transfection the ABC1 polypeptide;

[0366] b) incubating the cells of step a) in the presence of an anion labeled with a detectable marker;

[0367] c) washing the cells of step b) in order to remove the excess of the labeled anion which has not penetrated into these cells;

[0368] d) incubating the cells obtained in step c) with an agonist or antagonist candidate compound for the ABC1 polypeptide;

[0369] e) measuring the efflux of the labeled anion;

[0370] f) comparing the value of the efflux of the labeled anion determined in step e) with the value of the efflux of the labeled anion measured with cells which have not been previously incubated in the presence of the agonist or antagonist candidate compound for the ABC1 polypeptide.

[0371] According to a first aspect of the above screening method, the cells used are cells naturally expressing the ABC1 polypeptide. They may be human monocytes in primary culture, purified from a population of human blood mononuclear cells. They may also be human monocytic cell lines, such as the monocytic leukemia line THP1.

[0372] According to a second aspect, the cells used in the screening method described above may be cells not naturally expressing, or alternatively expressing at a low level, the ABC1 polypeptide, said cells being transfected with a recombinant vector according to the invention capable of directing the expression of the ABC1 polypeptide.

[0373] According to a third aspect, the cells may be cells having a natural deficiency in anion transport, or cells pretreated with one or more anion channel inhibitors such as Verapamil™ or tetraethylammonium.

[0374] According to a fourth aspect of said screening method, the anion is a radioactively labeled iodide, such as the salts K¹²⁵I or Na¹²⁵I.

[0375] According to a fifth aspect, the measurement of the efflux of the labeled anion is determined periodically over time during the experiment, thus making it possible to also establish a kinetic measurement of this efflux.

[0376] According to a sixth aspect, the value of the efflux of the labeled anion is determined by measuring the quantity of labeled anion present at a given time in the cell culture supernatant.

[0377] According to a seventh aspect, the value of the efflux of the labeled anion is determined as the proportion of radioactivity found in the cell culture supernatant relative to the total radioactivity corresponding to the sum of the radioactivity found in the cell lysates and the radioactivity found in the cell culture supernatant.
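The efflux value described in the seventh aspect is the radioactivity in the supernatant divided by the sum of the radioactivity in the supernatant and in the cell lysate. A minimal sketch, with hypothetical counts and function names, including the periodic (kinetic) measurement of the fifth aspect:

```python
def fractional_efflux(supernatant_counts, lysate_counts):
    """Proportion of total radioactivity found in the culture supernatant."""
    total = supernatant_counts + lysate_counts
    if total <= 0:
        raise ValueError("total radioactivity must be positive")
    return supernatant_counts / total


def efflux_time_course(samples):
    """Map (time, supernatant, lysate) triples to (time, fractional efflux)."""
    return [(t, fractional_efflux(s, l)) for t, s, l in samples]


if __name__ == "__main__":
    # Hypothetical counts per minute sampled at 0, 5 and 10 minutes.
    samples = [(0, 0.0, 1000.0), (5, 250.0, 750.0), (10, 500.0, 500.0)]
    for time_min, efflux in efflux_time_course(samples):
        print(time_min, efflux)
```

The same function applied to cells incubated without the candidate compound gives the reference value used in the comparison step.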

[0378] The subject of the invention is also a method of screening a compound active on the metabolism of cholesterol, an agonist or antagonist of the ABC1 polypeptide, said method comprising the following steps:

[0379] a) culturing cells of a human monocytic line in an appropriate culture medium, in the presence of purified human albumin;

[0380] b) incubating the cells of step a) simultaneously in the presence of a compound stimulating the production of IL-1 beta and of the agonist or antagonist candidate compound;

[0381] c) incubating the cells obtained in step b) in the presence of an appropriate concentration of ATP;

[0382] d) measuring IL-1 beta released into the cell culture supernatant;

[0383] e) comparing the value of the release of the IL-1 beta obtained in step d) with the value of the IL-1 beta released into the culture supernatant of cells which have not been previously incubated in the presence of the agonist or antagonist candidate compound.

[0384] According to a first aspect of the screening method described above, the cells used belong to the human leukemic monocytic line THP1.

[0385] According to a second aspect of the screening method, the compound stimulating the production of IL-1 beta is a lipopolysaccharide. According to a third aspect of said method, the production of IL-1 alpha, IL-6 and TNF alpha by these cells is also qualitatively and/or quantitatively determined.

[0386] According to a fourth aspect, the level of expression of the messenger RNA encoding IL-1 beta is also determined.

[0387] The invention is illustrated, without being limited as a result, by the following figures and examples:

[0388] FIG. 1 illustrates the segregation of the mutation by insertion of an Alu sequence into exon 12 of the ABC1 gene. The insertion-deletion in exon 12 of the ABC1 gene consists of a deletion of 14 nucleotides and an insertion of 110 nucleotides, as represented in FIG. 1A. FIG. 1B represents the Nu pedigree and the size of the DNA fragments obtained for each of the patients after PCR amplification of exon 12. Lane M corresponds to the mobility markers (Gibco BRL). Lane C corresponds to a control DNA.

[0389] FIG. 2 illustrates the mutation by deletion of a single nucleotide in exon 13 of the ABC1 gene. The sequence of the complementary strand was obtained and is represented from 3′ to 5′. The sequence encoding amino acids 546 to 552 of the ABC1 polypeptide is represented for different patients of the family studied, respectively for a homozygous (FIG. 2-a), heterozygous (FIG. 2-b) and nonaffected (FIG. 2-c) individual. The sequence of FIG. 2-a was found in three homozygous individuals, the sequence of FIG. 2-b was found in five heterozygous individuals and the sequence of FIG. 2-c was found in four nonaffected individuals.

EXAMPLES

Example 1 Tissue Distribution of the Transcripts of the ABC1 Gene According to the Invention

[0390] The profile of expression of the polynucleotides according to the present invention is determined according to the protocols for PCR-coupled reverse transcription and Northern blot analysis described in particular by Sambrook et al. (ref. CSH Sambrook, J., Fritsch, E. F., and Maniatis, T. (1989). “Molecular Cloning: A Laboratory Manual,” 2nd ed., Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y.).

[0391] For example, in the case of an analysis by reverse transcription, a pair of primers synthesized from the complete DNA of the human ABC1 gene having the sequence SEQ ID NO 91 is used to detect the corresponding cDNA.

[0392] The polymerase chain reaction (PCR) is carried out on cDNA templates corresponding to retrotranscribed polyA⁺ mRNAs (Clontech). The reverse transcription to cDNA is carried out with the enzyme SUPERSCRIPT II (GibcoBRL, Life Technologies) according to the conditions described by the manufacturer. The polymerase chain reaction is carried out according to standard conditions, in 20 μl of reaction mixture with 25 ng of cDNA preparation. The reaction mixture is composed of 400 μM of each of the dNTPs, 2 units of Thermus aquaticus (Taq) DNA polymerase (Ampli Taq Gold; Perkin Elmer), 0.5 μM of each primer, 2.5 mM MgCl₂, and PCR buffer. Thirty-four PCR cycles (denaturing for 30 s at 94° C.; annealing for 30 s at a temperature stepped down over the 34 cycles as follows: 64° C. for 2 cycles, 61° C. for 2 cycles, 58° C. for 2 cycles and 55° C. for 28 cycles; and extension for one minute per kilobase at 72° C.) are carried out after a first step of denaturing at 94° C. for 10 min in a Perkin Elmer 9700 thermocycler. The PCR reactions are visualized on agarose gel by electrophoresis. The cDNA fragments obtained may be used as probes for a Northern blot analysis and may also be used for the exact determination of the polynucleotide sequence.
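The stepped annealing program described above can be written out as an explicit cycle list, which also confirms that the four annealing stages add up to 34 cycles. The temperatures, durations and cycle counts are taken from the text; the data layout, the function name and the example amplicon length are illustrative assumptions.

```python
# (annealing temperature in deg C, number of cycles), as stated in the text.
ANNEALING_STAGES = [(64, 2), (61, 2), (58, 2), (55, 28)]

INITIAL_DENATURATION = (94, 10 * 60)  # 94 deg C for 10 min, in seconds


def build_cycles(amplicon_kb):
    """Return one dict per PCR cycle; each step is (temperature, seconds)."""
    extension_seconds = 60 * amplicon_kb  # one minute per kilobase at 72 deg C
    cycles = []
    for anneal_temp, count in ANNEALING_STAGES:
        for _ in range(count):
            cycles.append({
                "denature": (94, 30),
                "anneal": (anneal_temp, 30),
                "extend": (72, extension_seconds),
            })
    return cycles


if __name__ == "__main__":
    program = build_cycles(amplicon_kb=1.5)  # hypothetical 1.5 kb product
    print(len(program), "cycles; first anneal at", program[0]["anneal"][0],
          "last anneal at", program[-1]["anneal"][0])
```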

[0393] In the case of a Northern Blot analysis, a cDNA probe produced as described above is labeled with ³²P by means of the DNA labeling system High Prime (Boehringer) according to the instructions indicated by the manufacturer. After labeling, the probe is purified on a Sephadex G50 microcolumn (Pharmacia) according to the instructions indicated by the manufacturer. The labeled and purified probe is then used for the detection of the expression of the mRNAs in various tissues.

[0394] The Northern blot containing samples of RNA of different human tissues (Multiple Tissue Northern, MTN, Clontech, Blot 2, reference 77759-1) is hybridized with the labeled probe.

[0395] The protocol followed for the hybridizations and washes may be either directly as that described by the manufacturer (Instruction manual PT1200-1) or an adaptation of this protocol using methods known to persons skilled in the art and described for example in F. Ausubel et al (1999). It is thus possible to vary, for example, the prehybridization and hybridization temperatures in the presence of formamide.

[0396] For example, it may be possible to use the following protocol:

[0397] 1—Membrane Competition and Prehybridization:

[0398] Mix: 40 μl salmon sperm DNA (10 mg/ml)+40 μl human placental DNA (10 mg/ml)

[0399] Denature for 5 min at 96° C., then immerse the mixture in ice.

[0400] Remove the 2×SSC and pour 4 ml of formamide mix in the hybridization tube containing the membranes.

[0401] Add the mixture of the two denatured DNAs.

[0402] Incubation at 42° C. for 5 to 6 hours, with rotation.

[0403] 2—Labeled Probe Competition:

[0404] Add to the labeled and purified probe 10 to 50 μl Cot I DNA, depending on the quantity of repeat sequences.

[0405] Denature for 7 to 10 min at 95° C.

[0406] Incubate at 65° C. for 2 to 5 hours.

[0407] 3—Hybridization:

[0408] Remove the prehybridization mix. Mix 40 μl salmon sperm DNA+40 μl human placental DNA; denature for 5 min at 96° C., then immerse in ice.

[0409] Add to the hybridization tube 4 ml of formamide mix, the mixture of the two DNAs and the denatured labeled probe/Cot I DNA.

[0410] Incubate 15 to 20 hours at 42° C., with rotation.

[0411] 4—Washes:

[0412] One wash at room temperature in 2×SSC, to rinse.

[0413] Twice 5 minutes at room temperature 2×SSC and 0.1% SDS at 65° C.

[0414] Twice 15 minutes at 65° C. in 1×SSC and 0.1% SDS.

[0415] After hybridization and washing, the blot is analyzed after overnight exposure in contact with a phosphor screen, which is read with the aid of a Storm instrument (Molecular Dynamics, Sunnyvale, Calif.).

Example 2 Production of the Complete cDNA of the ABC1 Gene

[0416] The sequence of the 3′-UTR region of the cDNA of the human ABC1 gene was identified by searching in databases.

[0417] An iterative screening of a database of EST sequences (“Genbank mouse human subdivision EST, v.111”) was carried out with the aid of the BLAST software.

[0418] Oligonucleotide primers were synthesized from the partial consensus sequence derived from the EST sequences, in order to amplify by an RT-PCR reaction the 3′ end of the cDNA of the human ABC1 gene, and then to determine the sequence thereof.

[0419] The oligonucleotide primers used are the following:
1. 5′-AAACCAGACAGTAGTGGACG-3′ (SEQ ID NO 142)
2. 5′-GTTACTGCCACCAGAACAGC-3′ (SEQ ID NO 143)
3. 5′-TGATAAGCTGTTCTGGTGGC-3′ (SEQ ID NO 144)
4. 5′-CTTGGCTTTTGCATTGTTGC-3′ (SEQ ID NO 145)
5. 5′-CAATGCAAAAGCCAAGAAAG-3′ (SEQ ID NO 146)
6. 5′-TGCAACGATGCCATATCAC-3′ (SEQ ID NO 147)
7. 5′-CAACTCCTTACTTCGGTTCCTC-3′ (SEQ ID NO 148)
8. 5′-GTTTTCTGAGGTGTCCCAAAG-3′ (SEQ ID NO 149)
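Basic properties of the primers listed above can be checked with a short sketch. The sequences are those given in the text; the GC-content calculation, the reverse-complement helper and the use of the Wallace rule (2 °C per A/T, 4 °C per G/C) as a rough melting-temperature estimate are illustrative additions, not methods described in the specification.

```python
PRIMERS = {
    142: "AAACCAGACAGTAGTGGACG",
    143: "GTTACTGCCACCAGAACAGC",
    144: "TGATAAGCTGTTCTGGTGGC",
    145: "CTTGGCTTTTGCATTGTTGC",
    146: "CAATGCAAAAGCCAAGAAAG",
    147: "TGCAACGATGCCATATCAC",
    148: "CAACTCCTTACTTCGGTTCCTC",
    149: "GTTTTCTGAGGTGTCCCAAAG",
}

_COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def gc_percent(seq):
    """Percentage of G and C bases in the primer."""
    return 100.0 * sum(1 for base in seq if base in "GC") / len(seq)


def wallace_tm(seq):
    """Rule-of-thumb melting temperature: 2 * (A+T) + 4 * (G+C), in deg C."""
    gc = sum(1 for base in seq if base in "GC")
    return 2 * (len(seq) - gc) + 4 * gc


def reverse_complement(seq):
    """Sequence of the opposite strand, read 5' to 3'."""
    return "".join(_COMPLEMENT[base] for base in reversed(seq))


if __name__ == "__main__":
    for seq_id, seq in PRIMERS.items():
        print(seq_id, len(seq), round(gc_percent(seq), 1), wallace_tm(seq))
```

The Wallace rule is only a coarse guide for short oligonucleotides; nearest-neighbor models give more reliable estimates.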

[0420] The reverse transcription of the poly(A)+ mRNA from brain, fetal brain, heart, uterus and placenta tissues (Libraries marketed by the company Clontech) was carried out by extension with the aid of oligodT primers using the Superscript™ kit (marketed by the company Life Technologies Inc.), according to the manufacturer's instructions.

[0421] In each experiment, it was possible to exclude the presence of contaminating DNA because of the absence of PCR-amplified polynucleotides in the samples not containing reverse transcriptase.

[0422] A PCR reaction was carried out on the products which have been subjected or otherwise to a first step of reverse transcription under the following conditions:

[0423] 400 μM dNTP, 2 Units of Taq DNA polymerase (Thermus aquaticus, Ampli Taq Gold, marketed by the company Perkin Elmer), 0.5 μM of each of the primers, 2.5 mM of MgCl₂, the whole being present in a PCR buffer also containing 50 ng of DNA and about 25 ng of cDNA.

[0424] The PCR reaction was carried out for 34 cycles in a thermocycler apparatus (“Perkin Elmer 9700 Thermal Cycler”) in 96-well microplates.

[0425] After an initial denaturation at 94° C. for 10 minutes, each cycle was carried out in the following manner:

[0426] step of denaturation at 94° C. for 30 seconds; step of annealing for 30 seconds (at 64° C. for 2 cycles, at 61° C. for 2 cycles, at 58° C. for 2 cycles and at 55° C. for 28 cycles).

[0427] step of extension for a period corresponding to 1 minute per kilobase.

[0428] The PCR reaction was stopped by a final extension step of 7 minutes.

[0429] Various other approaches may be used to isolate the cDNA corresponding to the complete cDNA of ABC1.

[0430] For example, a complete clone may be directly isolated by hybridization by screening a cDNA library by means of a polynucleotide probe specific for the sequence of the gene of interest.

[0431] In particular, a specific probe of 30 to 40 nucleotides is synthesized according to the chosen sequence, using a synthesizer of the Applied Biosystems/Perkin Elmer brand.

[0432] The oligonucleotide obtained is radiolabeled, for example with [γ-³²P]ATP using T4 polynucleotide kinase, and is purified according to the customary methods (e.g. Maniatis et al., Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Press, Cold Spring Harbor, N.Y., 1982, or F. Ausubel et al., Current Protocols in Molecular Biology, J. Wiley and Sons Eds, 1989).

[0433] The clone library containing the cDNA which it is desired to screen is established on a culture medium in a Petri dish (1.5% agar) containing the appropriate antibiotics according to the customary methods cited above (F. Ausubel et al.). The colonies thus produced after incubation are transferred on nitrocellulose filters and screened by means of the radiolabeled nucleotide probe, according to the customary methods and the colonies hybridizing with the probe are isolated and subcloned.

[0434] The DNA of the clones thus identified is prepared and analyzed by sequencing. The clones containing the fragments corresponding to the complete cDNA are purified and recloned into the vector pcDNA3 according to the protocols known to persons skilled in the art and presented for example in F. Ausubel et al (1989).

[0435] Various methods are known for identifying the 5′ and 3′ ends of the cDNA corresponding to the genes described in the present application. These methods include, but are not limited to, hybridization cloning and cloning using protocols similar or identical to 3′ or 5′ RACE-PCR (Rapid Amplification of cDNA Ends-PCR), which are well known to persons skilled in the art.

[0436] For example, it will be possible to use the kit marketed by the company Clontech (Marathon Ready™ cDNA kit, protocol identified by the reference PT1156-1). Alternatively, a method similar to 5′ RACE is available for characterizing the absent 5′ end of a cDNA (Fromont-Racine et al., Nucleic Acids Res. 21(7):1683-1684 (1993)). Briefly, an RNA oligonucleotide is ligated to the 5′ end of an mRNA population. After reverse transcription to cDNA, a pair of primers specific respectively for the adaptor ligated at the 5′ end and for a sequence situated 3′ in the gene of interest is used in PCR to amplify the 5′ portion of the desired cDNA. The amplified fragment is then used to reconstruct the complete cDNA.

Example 3 Analysis of the Gene Expression Profile for Tangier Disease

[0437] The verification of the impairment of the level of expression of the ABC1 gene causing the Tangier cellular phenotype may be determined by hybridizing these sequences with probes corresponding to the mRNAs obtained from fibroblasts of subjects suffering or otherwise from the disease, according to the methods described below:

[0438] 1. Preparation of the Total RNAs, of the poly(A)⁺ mRNAs and of cDNA Probes

[0439] The total RNAs are obtained from cell cultures of the fibroblasts of normal subjects or subjects suffering from Tangier disease by the guanidine isothiocyanate method (Chomczynski & Sacchi, 1987). The poly(A)⁺ mRNAs are obtained by affinity chromatography on oligo(dT)-cellulose columns (Sambrook et al., 1989) and the cDNAs used as probes are obtained by RT-PCR (DeRisi et al., 1997) with oligonucleotides labeled with a fluorescent product (Amersham Pharmacia Biotech CyDye™).

[0440] 2. Hybridization and Detection of the Expression Levels

[0441] The glass membranes containing the sequences presented in this patent application, corresponding to the Tangier gene, are hybridized with the cDNA probes obtained from fibroblasts (Iyer et al., 1999). The use of the Amersham/Molecular Dynamics system (Avalanche Microscanner™) allows the quantification of the expression of the products of the sequences in healthy or affected cell types.

Example 4 Construction of the Expression Vector Containing the Complete cDNA of ABC1 in Mammalian Cells

[0442] The ABC1 gene may be expressed in mammalian cells. A typical eukaryotic expression vector contains a promoter which allows the initiation of the transcription of the mRNA, a sequence encoding the protein, and the signals required for the termination of the transcription and for the polyadenylation of the transcript. It also contains additional signals such as enhancers, the Kozak sequence and sequences necessary for the splicing of the mRNA. An effective transcription is obtained with the early and late elements of the SV40 virus promoters, the retroviral LTRs or the CMV virus early promoter. However, cellular elements such as the actin promoter may also be used. Many expression vectors may be used to carry out the present invention, such as the vector pcDNA3.

Example 5 Production of Normal and Mutated ABC1 Polypeptides

[0443] The normal ABC1 polypeptide encoded by the complete cDNA of ABC1 whose isolation is described in Example 2 (cloning of the complete cDNA), or the mutated ABC1 polypeptides whose complete cDNA may also be obtained according to the techniques described in Example 2, may be easily produced in a bacterial or insect cell expression system using the baculovirus vectors or in mammalian cells with or without the vaccinia virus vectors. All the methods are now widely described and are known to persons skilled in the art. A detailed description thereof will be found for example in F. Ausubel et al. (1989).

Example 6 Production of an Antibody Directed Against One of the Mutated ABC1 Polypeptides

[0444] The antibodies in the present invention may be prepared by various methods (Current Protocols In Molecular Biology Volume 1 edited by Frederick M. Ausubel, Roger Brent, Robert E. Kingston, David D. Moore, J. G. Seidman, John A. Smith, Kevin Struhl—Massachusetts General Hospital Harvard Medical School, chapter 11). For example, the cells expressing a polypeptide of the present invention are injected into an animal in order to induce the production of serum containing the antibodies. In one of the methods described, the proteins are prepared and purified so as to avoid contaminations. Such a preparation is then introduced into the animal with the aim of producing polyclonal antisera having a higher activity.

[0445] In the preferred method, the antibodies of the present invention are monoclonal antibodies. Such monoclonal antibodies may be prepared using the hybridoma technique (Köhler et al., Nature 256:495 (1975); Köhler et al., Eur. J. Immunol. 6:511 (1976); Köhler et al., Eur. J. Immunol. 6:292 (1976); Hammerling et al., in: Monoclonal Antibodies and T-Cell Hybridomas, Elsevier, N.Y., pp. 563-681 (1981)). In general, such methods involve immunizing the animal (preferably a mouse) with a polypeptide or better still with a cell expressing the polypeptide. These cells may be cultured in a suitable tissue culture medium. However, it is preferable to culture the cells in an Eagle medium (modified Earle) supplemented with 10% fetal bovine serum (inactivated at 56° C.) and supplemented with about 10 g/l of nonessential amino acids, 1000 U/ml of penicillin and about 100 μg/ml of streptomycin.

[0446] The splenocytes of these mice are extracted and fused with a suitable myeloma cell line. However, it is preferable to use the parental myeloma cell line (SP2O) available from the ATCC. After fusion, the resulting hybridoma cells are selectively maintained in HAT medium and then cloned by limiting dilution as described by Wands et al. (Gastroenterology 80:225-232 (1981)). The hybridoma cells obtained after such a selection are tested in order to identify the clones secreting antibodies capable of binding to the polypeptide.

[0447] Moreover, other antibodies capable of binding to the polypeptide may be produced according to a 2-stage procedure using anti-idiotype antibodies. Such a method is based on the fact that the antibodies are themselves antigens and consequently it is possible to obtain an antibody recognizing another antibody. According to this method, the antibodies specific for the protein are used to immunize an animal, preferably a mouse. The splenocytes of this animal are then used to produce hybridoma cells, and the latter are screened in order to identify the clones which produce an antibody whose capacity to bind to the specific antibody-protein complex may be blocked by the polypeptide. These antibodies may be used to immunize an animal in order to induce the formation of antibodies specific for the protein in a large quantity.

[0448] Advantageously, Fab and F(ab′)2 and the other fragments of the antibodies of the present invention can be used according to the methods described here. Such fragments are typically produced by proteolytic cleavage with the aid of enzymes such as papain (in order to produce the Fab fragments) or pepsin (in order to produce the F(ab′)2 fragments). Alternatively, the secreted fragments recognizing the protein may be produced by applying recombinant DNA or synthetic chemistry technology.

[0449] For the in vivo use of antibodies in humans, it would be preferable to use “humanized” chimeric monoclonal antibodies. Such antibodies may be produced using genetic constructs derived from hybridoma cells producing the monoclonal antibodies described above. The methods for producing the chimeric antibodies are known to persons skilled in the art (for a review, see: Morrison, Science 229:1202 (1985); Oi et al., BioTechniques 4:214 (1986); Cabilly et al., U.S. Pat. No. 4,816,567; Taniguchi et al., EP 171496; Morrison et al., EP 173494; Neuberger et al., WO 8601533; Robinson et al., WO 8702671; Boulianne et al., Nature 312:643 (1984); Neuberger et al., Nature 314:268 (1985)).

Example 7 Correction of the Cellular Phenotype of the Tangier Disease

[0450] The Tangier disease is characterized by an accelerated catabolism of the high-density lipoprotein (HDL) particles and an accumulation of cholesterol in the tissues. In particular, the fibroblasts of the skin of patients suffering from Tangier disease have a reduced capacity to eliminate their cholesterol content by the process of efflux of cholesterol carried out by apolipoprotein A-I (apoA-I), the major protein of the HDLs (Francis et al., 1995). This characteristic corresponding to a loss of function is also found in other fibroblast cells of patients suffering from familial HDL deficiency (Marcil et al., 1999).

[0451] The correction of the phenotype of the Tangier fibroblasts can be carried out by the transfection of the complete cDNA of ABC1 according to the invention, into said cells. The cDNA is inserted into an expression vector which is then transfected according to the methods described below:

[0452] 1. Preparation of the Fibroblast Cultures of Normal Subjects and of Subjects Suffering from Tangier Disease

[0453] The primary fibroblasts of human skin are obtained by culturing a skin biopsy obtained from the forearm. These biopsies are performed on patients suffering from Tangier disease having the clinical and biochemical features of the “homozygotes”, that is to say orange-colored tonsils and plasma concentrations of apoA-I and of HDL cholesterol below the 5th percentile. The normal fibroblast lines are obtained from the American Type Culture Collection (Rockville, Md.). The fibroblasts are cultured in an EMMEM (Eagle-modified minimum essential medium; GIBCO) medium supplemented with 10% fetal calf serum, 2 mM glutamine, 100 IU/ml of penicillin and 100 μg/ml of streptomycin (medium designated EMMEM10). In order to carry out the study of the efflux of cholesterol, these cells are preloaded with cholesterol by incubating for 24 hours with 50 μg/ml of cholesterol in the medium described above without calf serum but containing 2 mg/ml of bovine albumin (BSA, fraction V).

[0454] 2. Study of the Efflux of Cholesterol

[0455] The fibroblasts preloaded with cholesterol at confluence on 24-well plates are incubated in the EMMEM10 medium and 1 μCi/ml of 1,2-³H-cholesterol (50 Ci/mmol; Dupont; Wilmington, Del.) for 48 hours. About 100,000 counts per minute are obtained per well, or 1000 counts per minute per μg of cellular protein. The cells are washed three times with EMMEM/BSA medium, and incubated with this medium for 24 hours before transfecting the gene of interest and starting the efflux by adding 10 μg/ml of proteoliposome containing apoA-I in EMMEM/BSA medium. These proteoliposomes are prepared by sonication of phosphatidylcholine and purified human apoA-I (Jonas, 1986). The cell transfection is carried out by the calcium phosphate precipitation technique (Sambrook et al., 1989). After the period of efflux, in general 20 hours, the medium is collected, centrifuged (1000 g, 5 min), and the radioactivity determined by liquid scintillation counting. The residual radioactivity in the cells is also determined after overnight extraction of the lipids in isopropanol. The percentage efflux is calculated by dividing the radioactivity measured in the supernatant by the sum of the radioactivities measured in the supernatant and the cellular extract. An internal standard is prepared by transfection of a marker gene and incubation for 24 hours with an EMMEM/BSA medium without proteoliposome containing apoA-I. The efflux of cellular cholesterol from normal fibroblasts transfected with a control gene corresponds to 6±2%, whereas that obtained from fibroblasts of subjects suffering from Tangier disease and transfected with this control gene is less than 1%. On the other hand, the transfection of the fibroblasts of subjects suffering from Tangier disease with a plasmid containing the complete cDNA or the genomic DNA for ABC1 according to the invention could make it possible to restore the capacity of these cells to eliminate their excess of cholesterol at a level corresponding to that of normal fibroblasts.
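The percentage efflux defined above (radioactivity in the supernatant divided by the sum of the radioactivities in the supernatant and the cellular extract) can be illustrated by the following minimal sketch. The function name and the counts used in the example are hypothetical and are not measured values.

```python
def percent_efflux(medium_cpm, cell_cpm):
    """Percentage efflux = supernatant counts / (supernatant + cellular counts) * 100."""
    total = medium_cpm + cell_cpm
    if total <= 0:
        raise ValueError("total radioactivity must be positive")
    return 100.0 * medium_cpm / total


# Hypothetical counts per minute for a single well (illustration only).
print(percent_efflux(medium_cpm=6000, cell_cpm=94000))
```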

Example 8 Isolation and Characterization of Genomic Fragments of the Human ABC1 Gene

[0456] A fragment of about 3 kb of the human ABC1 cDNA was obtained from the cDNA clone designated “pf10” containing the first ATP-binding domain of ABC1, this cDNA clone being described in the article by Luciani et al. (1994).

[0457] This cDNA fragment obtained by digestion of the clone pf10 with the aid of the restriction endonuclease EcoRI, was isolated on an agarose gel after electrophoresis, then labeled with digoxigenin according to the manufacturer's instructions (kit marketed by Boehringer Mannheim, reference 1 585 614).

[0458] The labeled cDNA fragment was used to screen the LLNL (Lawrence Livermore National Labs) cosmid library of chromosome 9, immobilized on a Nylon™ filter.

[0459] Six positive clones were identified. For these six cosmids, the probe hybridized with single colonies.

[0460] A representative clone was isolated from each of these colonies.

[0461] The clones LLNLC 131J087 Q2 (designated here cos3a) and LLNLc 131O1165 Q2 (designated here cos6f) were analyzed in greater detail.

[0462] The clone cos3a was subcloned in the form of an EcoRI fragment into the vector Gen3zf(−) and sequenced at both ends using the Big Dye Terminator technology on an ABI377 type sequencer (Applied Biosystems, Perkin Elmer).

[0463] The clones containing distinct inserts (determined after sequencing of the ends of the various inserts or by determining the size thereof) which were too long to be completely sequenced with the aid of the primers hybridizing with the sequences of the vector were analyzed further by the technique of transposon insertion followed by sequencing with the aid of primers specific to the transposon (“GPS” system marketed by the company New England Biolabs).

[0464] In this manner, genomic sequences corresponding to the human ABC1 gene were isolated and characterized. These sequences were compared with human and mouse sequences identified by references in the databases making it possible to determine the intron-exon junctions.

Example 9 Determination of Polymorphisms/Mutations in the ABC1 Gene

[0465] The detection of polymorphisms or of mutations in the sequences of the transcripts or in the genomic sequence of the ABC1 gene may be carried out according to various protocols. The preferred method is direct sequencing. For patients from whom it is possible to obtain an mRNA preparation, the preferred method consists in preparing the cDNAs and sequencing them directly. For patients for whom only DNA is available, and in the case of a transcript where the structure of the corresponding gene is unknown or partially known, it is necessary to precisely determine its intron-exon structure as well as the genomic sequence of the corresponding gene. This therefore involves, in a first instance, isolating the genomic DNA BAC or cosmid clone(s) corresponding to the transcript studied according to the method described in Example 8, sequencing the insert of the corresponding clone(s) and determining the intron-exon structure by comparing the cDNA sequence to that of the genomic DNA obtained.

[0466] The technique of detection of mutations by direct sequencing consists in comparing the genomic sequences of the ABC1 gene obtained from homozygotes for the disease or from at least 8 individuals (4 individuals affected by the pathology studied and 4 individuals not affected). The sequence divergences constitute polymorphisms. All those modifying the amino acid sequence of the wild-type protein may be mutations capable of affecting the function of said protein, which it is advantageous to consider more particularly for the study of cosegregation of the mutation and of the disease (denoted genotype-phenotype correlation) in the pedigree or in the studies of case/control association for the analysis of the sporadic cases.
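The comparison step described above may be illustrated by a minimal sketch that lists the positions at which a sample sequence diverges from a reference sequence. It assumes that the two sequences have already been aligned to the same length and handles substitutions only; insertions and deletions would first require an alignment step. The function name and the two short sequences are hypothetical and are not taken from the ABC1 gene.

```python
def sequence_divergences(reference, sample):
    """Return (1-based position, reference base, sample base) for each mismatch.

    Assumes both sequences are already aligned and of equal length.
    """
    if len(reference) != len(sample):
        raise ValueError("sequences must be aligned to the same length")
    pairs = zip(reference.upper(), sample.upper())
    return [(i + 1, r, s) for i, (r, s) in enumerate(pairs) if r != s]


# Hypothetical aligned fragments (illustration only).
reference_fragment = "ATGGCCTTA"
sample_fragment = "ATGGACTTA"
print(sequence_divergences(reference_fragment, sample_fragment))
```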

Example 10 Identification of a Causal Gene for a Disease Linked to a Deficiency in the Reverse Transport of Cholesterol by Causal Mutation or a Transcriptional Difference

[0467] Among the mutations identified according to the method described in Example 9, all those associated with the disease phenotype are capable of being causal. Validation of these results is made by sequencing the gene in all the affected individuals and their relatives (whose DNA is available). Moreover, carrying out Northern blotting or RT-PCR, according to the method described in Example 1, using RNA specific to affected or nonaffected individuals makes it possible to detect notable variations in the level of expression of the gene studied, in particular the absence of transcription of the gene.

Example 11 Identification of a Deletion of a Nucleotide in Exon 13 of the ABC1 Gene in Patients Suffering from Tangier Disease

[0468] The analysis of mutations in the ABC1 gene was carried out on genomic DNA from several individuals belonging to a family of which several members suffer from Tangier disease with premature coronary disorders.

[0469] A deletion of one nucleotide was identified in exon 13 (DG 1764: Leu548Leu;575 End). This deletion introduces a stop codon at position 575 which makes it possible to predict a truncation of the ABC1 protein encoded by the mutated ABC1 gene, this truncation leading to the synthesis of a polypeptide deleted of a large portion of the normal amino acid sequence, and in particular of the two cassettes for binding to ATP.

[0470] A perfect correlation between the observation of the symptoms of the disease and the presence of this deletion of one nucleotide was found in the entire family (FIG. 1).

Example 12 Identification of an Insertion of a Segment of Nucleotides into Exon 12 of the ABC1 Gene

[0471] In another family in which several members suffer from Tangier disease, an insertion of 110 base pairs having the structure of a repeated nucleotide sequence of the Alu-Sq type, accompanied by a deletion of 14 base pairs in exon 12, was observed (FIG. 2). This insertion/deletion mutation makes it possible to predict a deletion of 6 amino acids (DERKFW) as well as an in-frame insertion of 38 amino acids (EYSGVTSAHCNLCLLSSSDSRASASQVAGITAPATTPG).

[0472] This mutation does not allow the synthesis of a normal ABC1 transport polypeptide. It is therefore possible to conclude that Tangier disease, in the individuals in this family, is caused by a deficiency in the ABC1 gene.
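The consistency of the figures given for this insertion/deletion can be checked with simple arithmetic, as sketched below. The check uses only the values quoted in this example (110 base pairs inserted, 14 base pairs deleted, and the two peptide sequences as printed); the variable names are illustrative. The deleted peptide DERKFW as quoted contains 6 residues and the inserted peptide contains 38 residues.

```python
# Values quoted in the example above.
inserted_bp = 110
deleted_bp = 14
inserted_peptide = "EYSGVTSAHCNLCLLSSSDSRASASQVAGITAPATTPG"
deleted_peptide = "DERKFW"

# Net change at the nucleotide level; a multiple of 3 keeps the reading frame.
net_bp = inserted_bp - deleted_bp
in_frame = net_bp % 3 == 0
net_codons = net_bp // 3

# Net change at the protein level, from the peptide sequences as printed.
net_residues = len(inserted_peptide) - len(deleted_peptide)

print(net_bp, in_frame, net_codons, net_residues)
```

Under these figures, the net gain of 96 base pairs corresponds to 32 codons, which matches the net gain of 32 residues obtained from the two peptide sequences.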

Example 13 Identification of Biallelic Polymorphisms in the ABC1 Gene

[0473] Primers for the amplification of the DNA of the patients were designed from nonrepetitive sequences of the intron DNA of the ABC1 gene, in such a way that an amplification of the intron-exon junctions as well as the bases essential for the formation of the secondary structure during the RNA splicing step are included in the amplified fragments.

[0474] The various pairs of primers specifically developed are presented in Table V.

[0475] The results found on the DNA from a family containing cases of Tangier disease without coronary complication are shown in Table IV.

[0476] The genomic DNA of the patients was amplified with the aid of the primers described above using Qiagen's Star Taq kit or the Supertaq kit, using the hybridization conditions and the amplification cycle conditions recommended by the manufacturer.

[0477] The amplified PCR products were then purified using a kit marketed by the company Qiagen, and then sequenced by the Big Dye Terminator method on an ABI377 sequencer (Applied Biosystems, Perkin Elmer).

Example 14 Identification of a Region of 1 cM on the 9q31 Locus Associated with Tangier Disease

[0478] A first linkage analysis was described in the article by Rust et al. (1998).

[0479] This article presented a linkage analysis on three families of patients suffering from Tangier disease and defined a candidate interval of 8 cM in 9q31.

[0480] The applicant has carried out a linkage study by including four additional families as well as additional markers identified by references in public databases in order to refine the candidate region to about 1 cM, with reference to the genetic map published by Généthon (Dib et al., 1996).

[0481] The results of linkage analysis presented below allowed the applicant to exclude from the candidate region the genomic segments respectively proximal (centromeric) and distal (telomeric) to the markers D9S271 and D9S1866.

[0482] The candidate region is therefore located between these two excluded markers.

[0483] An important piece of information which made it possible to refine the candidate region was obtained from the E1 portion of the pedigree described in the article by Rust et al. (1998). Indeed, it has been shown on the maternal chromosome, at the origin of the E121m portion, that the recombination event (crossing-over) already described in FIG. 2 of this article (between the markers D9S277 and D9S53) in fact ought to be located telomerically relative to the marker D9S271. The centromeric boundary of the candidate interval is therefore situated between D9S271 and D9S277.

[0484] In two of the new families, respectively family “S1” and family “Nu”, additional recombination events were observed. These recombination events made it possible to move the telomeric boundary of the candidate interval from the marker D9S1677 (described in the article by Rust et al., 1998) to the marker D9S1866.

[0485] The first pedigree, “S1”, was extended. The affected individuals exhibit a homozygous genotype for all the markers of the 8 cM region as well as for the more distant markers, located on either side of this region. One of the cousins of the individual S1, related to S1 by both parents (double consanguinity) has four children. Two of these children exhibit on their chromosome of paternal origin conservation of a large portion of the diseased haplotype (in the defined region of 8 cM). These two children also exhibit the typical characteristic of the heterozygous parents of the family suffering from Tangier disease, namely an HDL level which is half the level observed in patients not affected by the disease.

[0486] However, the character homozygous for the markers is no longer observed in the chromosomal region starting from the marker D9S1866 (which is heterozygous in these individuals), which made it possible to define D9S1866 as the telomeric boundary of the candidate region.

[0487] The same telomeric boundary was observed in the “Nu” family, in which one of the four children from parents who were first degree cousins, was a patient affected by homozygous Tangier disease.

[0488] A homozygosity for the markers on the entire candidate region was observed in this patient.

[0489] One of his brothers, a heterozygote (at the phenotype level), exhibits a recombination event (crossing-over) on one of the two chromosomes, such that this brother is homozygous (at the genetic level) for all the telomeric markers including the marker D9S1866, but heterozygous (at the genetic level) for the markers located in the region near the centromere.

[0490] As this patient did not exhibit a phenotype of a homozygous patient, it was possible to exclude the telomeric region including the marker D9S1866.

Example 15 Isolation and Characterization of the Human ABC1 Gene

[0491] A fragment of about 3 kb of the human ABC1 cDNA was obtained from a cDNA clone designated “pf10” containing the first ATP-binding domain of ABC1, this cDNA clone being described in the article by Luciani et al. (1994).

[0492] This cDNA fragment obtained by digestion of the clone pf10 with the aid of the restriction endonuclease EcoRI, was isolated on an agarose gel after electrophoresis, then labeled with digoxigenin according to the manufacturer's instructions (kit marketed by Boehringer Mannheim).

[0493] The labeled cDNA fragment was used to screen the LLNL cosmid library of chromosome 9, immobilized on a Nylon™ filter.

[0494] Six positive clones were identified. For these six cosmids, the probe hybridized with single colonies.

[0495] A representative clone was isolated from each of these colonies.

[0496] The clones LLNLC 131J087 Q2 (designated here cos3a) and LLNLc 131O1165 Q2 (designated here cos6f) were analyzed in greater detail.

[0497] The clone cos3a was subcloned in the form of an EcoRI fragment into the vector Gen3zf(−) and sequenced at both ends using the Big Dye Terminator technology on an ABI377 type sequencer.

[0498] The clones containing distinct inserts (determined after sequencing of the ends of the various inserts or by determining the size thereof) which were too long to be completely sequenced with the aid of the primers hybridizing with the sequences of the vector were analyzed further by the technique of transposon insertion followed by sequencing with the aid of primers specific to the transposon (“GPS” system marketed by the company New England Biolabs).

[0499] In this manner, genomic sequences corresponding to the human ABC1 gene were isolated and characterized. These sequences were compared with human and mouse sequences identified by references in the databases.

[0500] The sequences of the intron-exon junctions were determined.

[0501] Primers for the amplification of the DNA of the patients were designed from nonrepetitive sequences of the intron DNA of the ABC1 gene, in such a way that an amplification of the intron-exon junctions as well as the bases essential for the formation of the secondary structure during the splicing step are included in the amplified fragments.

[0502] The genomic DNA of the patients was amplified with the aid of the primers described above using Qiagen's Star Taq kit or the Supertaq kit, using the hybridization conditions and the amplification cycle conditions recommended by the manufacturer.

[0503] The amplified PCR products were then purified using a kit marketed by the company Qiagen, and then sequenced by the Big Dye Terminator method on an ABI377 sequencer.

Example 16 Construction of Recombinant Vectors Containing a Polynucleotide Encoding the ABC1 Protein

[0504] I. Synthesis of the Human ABC1 Gene.

[0505] Total RNA (500 ng) isolated from human placental tissue (Clontech, Palo Alto, Calif., USA) was used as source for the synthesis of the cDNA of the human ABC1 gene, using the system “Superscript one step RT-PCR” (Life Technologies, Gaithersburg, Md., USA) and the oligonucleotide primers specific for ABC1 (0.25 μM) below: forward primer: 5′-CTACCCACCCTATGAACAAC-3′ (nt 75-94 of ABC1 cDNA); reverse primer: 5′-1GCCACCCCGTATGAACAGGG-3′ (nt 6731-6751 of ABC1 cDNA).

[0506] These oligonucleotide primers were synthesized by the phosphoramidite method on a DNA synthesizer of the ABI 394 type (Applied Biosystems, Foster City, Calif., USA).

[0507] The sites recognized by the restriction enzyme NotI were incorporated into the amplified cDNA of 6676 bp by a new amplification step using 50 ng of human ABC1 cDNA as template, and 0.25 μM of the oligonucleotide primers described above containing, at their 5′ end, the site recognized by the restriction enzyme NotI, in the presence of 200 μM of each of the deoxynucleotides dATP, dCTP, dTTP and dGTP as well as the Pyrococcus furiosus DNA polymerase (Stratagene, Inc., La Jolla, Calif., USA).

[0508] The PCR reaction was carried out over 30 cycles each comprising a step of denaturation at 95° C. for one minute, a step of renaturation at 50° C. for one minute and a step of extension at 72° C. for two minutes, in a thermocycler apparatus for PCR (Cetus Perkin Elmer Norwalk, Conn., USA).

[0509] II. Cloning of the cDNA of the Human ABC1 Gene into an Expression Vector:

[0510] The 6676 bp insert of the human ABC1 cDNA was cloned into the NotI restriction site of the expression vector pCMV containing a cytomegalovirus early promoter and an enhancer sequence as well as the SV40 polyadenylation signal (Beg et al., 1990; Applebaum-Boden, 1996), in order to produce the expression vector designated pABC1.

[0511] The sequence of the cloned cDNA was confirmed by sequencing on the two strands using the reaction set “ABI Prism Big Dye Terminator Cycle Sequencing ready” (marketed by Applied Biosystems, Foster City, Calif., USA) in a capillary sequencer of the ABI 1310 type (Applied Biosystems, Foster City, Calif., USA).

[0512] III. Construction of a Recombinant Adenoviral Vector Containing the cDNA of the Human ABC1 Gene

[0513] A—Modification of the Expression Vector pCMV-β.

[0514] The β-galactosidase cDNA of the expression vector pCMV-β (Clontech, Palo Alto, Calif., USA, GenBank Accession No. U02451) was deleted by digestion with the restriction endonuclease NotI and replaced with a multiple cloning site containing, from the 5′ end to the 3′ end, the following sites:

[0515] NotI, AscI, RsrII, AvrII, SwaI, and NotI (sequence of the multiple cloning site: 5′-CGGCCGCGGCGCGCCCGGACCGCCTAGGATTTAAATCGCGGCCCGCG-3′),

[0516] this multiple cloning site having been cloned at the level of the NotI site.

[0517] The DNA fragment between the EcoRI and SalI sites of the modified expression vector pCMV was isolated and cloned into the modified XbaI site of the shuttle vector pXCXII (McKinnon et al., 1982; McGrory et al., 1988).

[0518] B—Modification of the Shuttle Vector pXCXII.

[0519] A multiple cloning site comprising, from the 5′ end to the 3′ end, the XbaI, EcoRI, SfiI, PmeI, NheI, SrfI, PacI, SalI and XbaI restriction sites (having the sequence: 5′-GCTCTAGAATTCGGCCTCCGTGGCCGTTTAAACGCTAGCGCCCGGGCTTAATTAAGTCGACTCTAGAGC-3′)

[0520] was inserted at the level of the XbaI site (nucleotide at position 3329) of the vector pXCXII (McKinnon et al., 1982; McGrory et al., 1988).
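The order of the restriction sites in this multiple cloning site can be checked against the sequence given above with a short scan, sketched below. The recognition sequences are the standard ones for each enzyme (N denoting any base in the SfiI site); the sequence is the one printed above, written without the internal space, and the function name is illustrative. A lookahead pattern is used so that overlapping sites, such as XbaI and EcoRI at the 5′ end, are both reported.

```python
import re

# Standard recognition sequences, written 5'->3' (N in SfiI = any base).
SITES = {
    "XbaI": "TCTAGA",
    "EcoRI": "GAATTC",
    "SfiI": "GGCC[ACGT]{5}GGCC",
    "PmeI": "GTTTAAAC",
    "NheI": "GCTAGC",
    "SrfI": "GCCCGGGC",
    "PacI": "TTAATTAA",
    "SalI": "GTCGAC",
}

# Multiple cloning site as printed in the text, with the internal space removed.
MCS = ("GCTCTAGAATTCGGCCTCCGTGGCCGTTTAAACGCTAGCGCCCGGG"
       "CTTAATTAAGTCGACTCTAGAGC")


def find_sites(sequence, sites=SITES):
    """Return sorted (0-based start, enzyme) pairs for every site occurrence."""
    hits = []
    for enzyme, pattern in sites.items():
        # Zero-width lookahead so that overlapping occurrences are all found.
        for match in re.finditer(f"(?=({pattern}))", sequence):
            hits.append((match.start(), enzyme))
    return sorted(hits)


for start, enzyme in find_sites(MCS):
    print(start + 1, enzyme)
```

On the sequence as printed, the scan reports the sites in the order XbaI, EcoRI, SfiI, PmeI, NheI, SrfI, PacI, SalI, XbaI, in agreement with the order stated in the text.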

[0521] The EcoRI-SalI DNA fragment isolated from the modified vector pCMV-β containing the CMV promoter/enhancer, the donor and acceptor splicing sites of SV40 and the polyadenylation signal of SV40 was then cloned into the EcoRI-SalI site of the modified shuttle vector pXCX, designated pCMV-11.

[0522] C. Preparation of the Shuttle Vector pAD12-ABC1.

[0523] The human ABC1 cDNA is obtained by an RT-PCR reaction, as described above, and cloned at the level of the NotI site into the vector pCMV-12, yielding the vector pCMV-ABC1.

[0524] The ABC1 cDNA contained in the vector pCMV-ABC1 consists of a DNA fragment of 6676 bp comprising the sequence going from the nucleotide at position 75 to the nucleotide at position 6751 of the human ABC1 cDNA.

[0525] D. Construction of the ABC1 Recombinant Adenovirus.

[0526] The ABC1-rAdV recombinant adenovirus containing the human ABC1 cDNA was constructed according to the technique described by McGrory et al. (1988).

[0527] Briefly, the vector pAD12-ABC1 was cotransfected with the vector pJM17 according to the technique of Chen and Okayama (1987).

[0528] Likewise, the vector pAD12-Luciferase was constructed and cotransfected with the vector pJM17.

[0529] The recombinant adenoviruses were identified by PCR amplification and subjected to two purification cycles before a large-scale amplification in the human embryonic kidney cell line HEK 293 (American Type Culture Collection, Rockville, Md., USA).

[0530] The infected cells were collected 48 to 72 hours after their infection with the adenoviral vectors and subjected to five freeze-thaw lysing cycles.

[0531] The crude lysates were extracted with the aid of Freon (Halocarbon 113, Matheson Product, Secaucus, N.J., USA), sedimented twice in cesium chloride supplemented with 0.2% murine albumin (Sigma Chemical Co., St Louis, Mo., USA) and dialysed extensively against a buffer composed of 150 mM NaCl, 10 mM Hepes (pH 7.4), 5 mM KCl, 1 mM MgCl₂, and 1 mM CaCl₂.

[0532] The recombinant adenoviruses were stored at −70° C. and titrated before their administration to animals or their incubation with cells in culture.

[0533] The absence of wild-type contaminating adenovirus was confirmed by screening with the aid of PCR amplification using oligonucleotide primers located in the structural portion of the deleted region.

[0534] IV Validation of the Expression of the Human ABC1 cDNA

[0535] Polyclonal antibodies specific for the human ABC1 polypeptide were prepared in rabbits and chicks by injecting the synthetic polypeptide “LHKNQTWDVAVLTSFLQDEKVKESYV”, derived from the ABC1 protein. These polyclonal antibodies are used to detect and/or quantify the expression of the human ABC1 gene in cells and animal models by immunoblotting and/or immunodetection.

[0536] The biological activity of ABC1 may be monitored by quantifying the cholesterol fluxes induced by apoA-I using cells transfected with the vector pCMV-ABC1 which have been loaded with cholesterol (Remaley et al., 1997).

[0537] V. Expression In Vitro of the Human ABC1 cDNA in Cells.

[0538] Cells of the HEK293 line and of the COS-7 line (American Type Culture Collection, Bethesda, Md., USA), as well as fibroblasts in primary culture derived from Tangier patients or from patients suffering from hypo-alphalipoproteinemia, are transfected with the expression vector pCMV-ABC1 (5-25 μg) using Lipofectamine (BRL, Gaithersburg, Md., USA) or by coprecipitation with the aid of calcium chloride (Chen et al., 1987).

[0539] These cells may also be infected with the vector pABC1-AdV (multiplicity of infection, MOI = 10).
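
To make the MOI figure concrete, the following Python sketch works out how much virus stock an infection at MOI = 10 would require. Only the MOI value comes from the text; the cell number, the stock titer and the function name `virus_volume_ul` are hypothetical values chosen for illustration.

```python
def virus_volume_ul(moi, n_cells, titer_pfu_per_ml):
    """Volume of virus stock (in microlitres) needed for a given MOI.

    MOI is the number of plaque-forming units (PFU) added per cell,
    so the required PFU is simply moi * n_cells.
    """
    if titer_pfu_per_ml <= 0:
        raise ValueError("titer must be positive")
    pfu_needed = moi * n_cells
    volume_ml = pfu_needed / titer_pfu_per_ml
    return volume_ml * 1000.0  # convert ml to microlitres

# Hypothetical example: 2e5 cells per well and a stock at 1e10 PFU/ml.
volume = virus_volume_ul(moi=10, n_cells=2e5, titer_pfu_per_ml=1e10)
print(f"Add about {volume:.2f} microlitres of stock per well")
```

In practice a stock of this assumed titer would be diluted first so that a pipettable volume can be added to each well.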

[0540] The expression of human ABC1 may be monitored by immunoblotting as well as by quantification of the efflux of cholesterol induced by apoA-I using transfected and/or infected cells.

[0541] The complementation of the genetic defect affecting the Tangier patients and the hypo-alphalipoproteinemic patients may be confirmed, using fibroblasts from these patients, by detecting the expression of the normal ABC1 gene, which makes it possible to establish the functional importance of this receptor.

[0542] VI. Expression In Vivo of the ABC1 Gene in Various Animal Models.

[0543] An appropriate volume (100 to 300 μl) of a medium containing the purified recombinant adenovirus (pABC1-AdV or pLucif-AdV), containing from 10⁸ to 10⁹ plaque-forming units (PFUs), is infused into the saphenous vein of mice (C57BL/6, both control mice and models of transgenic or knock-out mice) on day 0 of the experiment.

[0544] The evaluation of the physiological role of the ABC1 protein in the metabolism of lipoproteins is carried out by determining the total quantity of cholesterol, of triglycerides, of phospholipids and of free cholesterol (Sigma and Wako Chemicals, Richmond, Va., USA), of HDL cholesterol (CIBA-Corning, Oberlin, Ohio, USA) and of apolipoproteins A-I, A-II, E and B from mice (Foger et al., 1997), before (day zero) and after (days 2, 4, 7, 10, 14) the administration of the adenovirus.

[0545] Kinetic studies with the aid of radioactively labelled products such as apoA-I-HDL, CE-HDL as well as apoB-LDL and CE-LDL are carried out on day 5 after the administration of the vectors rLucif-AdV and rABC1-AdV in order to evaluate the effect of the expression of ABC1 on the metabolism of the HDLs and of the LDLs as well as on the release of cholesterol toward the liver.

[0546] The effect of the expression of ABC1 on the development of atherosclerosis may be evaluated by quantifying the mean surface area of aortic lesion in apoE mice after administration of the vector rABC1-AdV.

[0547] Furthermore, transgenic mice and rabbits overexpressing the ABC1 gene may be produced, in accordance with the teaching of Vaisman (1995) and Hoeg (1996), using constructs containing the human ABC1 cDNA under the control of endogenous promoters such as ABC1, CMV or apoE.

[0548] The evaluation of the long-term effect of the expression of ABC1 on the kinetics of the plasma lipids, lipoproteins and apolipoproteins and on atherosclerosis may be carried out as described above.

Example 17 Use of Vesicles for the Screening of Agonist and Antagonist Molecules for the ABC1 Protein

[0549] The basis of this test is the reconstitution of membranes which have incorporated the ABC1 protein and which contain substrates such as cholesterol or phospholipids. The ABC1 protein may then be activated or its function repressed by the addition of molecules of interest. The outflow of the substrates through the channel formed by the ABC1 protein is then detected.

[0550] a) Reconstitution of a Membrane Containing the ABC1 Protein and a Labelled Lipid Substrate.

[0551] Various strategies may be used to manufacture these membranes: methods using organic solvents, mechanical means such as sonication or the “French press”, freeze-thaw cycles, or detergents (cholates, CHAPS, CHAPSO) (reference: Rigaud et al. Biochimica et Biophysica Acta 1231 (1995) 223-246). More particularly, a lipid substrate such as phospholipids, cholesterol or cholesterol ester, a radioactive substrate of the ³H-cholesterol, ¹²⁵I-cholesterol or ³H-phosphatidylcholine type or a fluorescent substrate with NBD or pyrene (Molecular Probes; http://www.probes.com) and phosphatidylcholine from eggs (1 mM) are dried on the bottom of a glass flask. Sodium cholate and the ABC1 protein are mixed in this flask in a mol to mol ratio of 0.3. The whole is vortex-mixed for 5 minutes, incubated at 25° C. for 30 minutes and then dialysed against a saline buffer. The proteoliposome produced according to this protocol is monitored by turbidimetry in order to verify that it has formed correctly.

[0552] b) Capture of the Proteoliposome on a Solid Surface

[0553] This step may be carried out by incorporating binding proteins of the integrin type. In this protocol, capture by antibodies directed against the ABC1 protein and previously adsorbed on a 96- or 384-well plate is used.

[0554] A solution containing these antibodies at a concentration of 100 μg/l is adsorbed on these multiwell plates by incubating overnight at 4° C. After washing, the plate is saturated with bovine albumin at 1 mg/ml, incubated for 2 hours at 37° C. The whole is then washed and incubated with the proteoliposomes containing ABC1 for 2 hours at 37° C.

[0555] c) Binding to the Molecules of Interest

[0556] This step is carried out by incubation of products for 1 hour at 37° C.

[0557] d) Determination of the Activation or Inhibition of the ABC1 Protein

[0558] If the substrate is fluorescent, the fluorescence of the supernatant indicates the activity of a product in inducing a transport of lipid to the outside of the proteoliposome. Alternatively, the use of a confocal system gives information on the quantities of substrate inside and outside the proteoliposome. If the substrate is radioactive, the use of CytoStar-type plates having a bottom with scintillation liquid makes it possible to reveal the substrate still sequestered in the proteoliposome.

Example 18 Use of Anion Transport for the Screening of Agonist and Antagonist Molecules for the ABC1 Protein

[0559] The principle of this test lies in the ability of the ABC1 protein to transport anions when it is activated.

[0560] a) The macrophage cells of the THP-1 line, human monocytic leukemia cells, are a model of differentiated macrophages. The cells are cultured in an RPMI 1640 medium supplemented with 10% foetal calf serum in 48-well plates at a density of 2×10⁵ cells per well. The fibroblast cells of patients suffering from Tangier disease may be used as a negative control because their ABC1 protein is not functional. Another negative control may be obtained by the addition of anti-ABC1 antibodies.

[0561] b) Anion transport defective cells or cells treated with anion channel inhibitors (Verapamil type, an inhibitor of P-glycoprotein, or tetraethylammonium, a potassium channel inhibitor) may also be used.

[0562] c) For the test itself, the cells are washed with Earle's modified salt solution (ESS) medium and preloaded with 1 ml of KI at 1 μmol/L (0.1 μCi/ml of Na¹²⁵I) in this ESS medium for 30 minutes at 37° C. The products are then added to the extracellular medium. The cells are then washed with the ESS medium.

[0563] d) The quantity of iodide in the medium is detected every minute for 11 minutes. The first two points correspond to the basal efflux. At the end of the incubation, the medium is taken up and the quantity of iodide remaining in the cells is counted following lysis of the cells in 1 M NaOH.

[0564] e) The total quantity of radioactivity at time zero is equal to the sum of the radioactivity found in the supernatant and the residual radioactivity in the cells. The efflux curves are constructed by plotting the percentage of radioactivity released into the medium as a function of time.
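
The calculation in steps d) and e) can be sketched in a few lines of Python. This is a minimal illustration under one stated assumption: each per-minute value is the radioactivity of a separately collected medium fraction, so the released amount accumulates over time. The function name `efflux_curve` and the example counts are hypothetical.

```python
from itertools import accumulate

def efflux_curve(fraction_counts, residual_counts):
    """Cumulative percentage of radioactivity released at each time point.

    fraction_counts: counts measured in the medium at each minute
                     (assumed to be separately collected fractions).
    residual_counts: counts remaining in the cells after NaOH lysis.
    """
    # Total at time zero = everything released + everything left in cells.
    total = sum(fraction_counts) + residual_counts
    if total <= 0:
        raise ValueError("total radioactivity must be positive")
    return [100.0 * released / total for released in accumulate(fraction_counts)]

# Hypothetical counts for four one-minute fractions and the cell lysate.
curve = efflux_curve([10, 10, 20, 20], residual_counts=40)
for minute, percent in enumerate(curve, start=1):
    print(f"minute {minute}: {percent:.1f}% released")
```

With these made-up counts the total at time zero is 100, so the curve reads 10, 20, 40 and 60 percent released; the first two points would represent the basal efflux.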

Example 19 Use of THP-1 Macrophages Expressing IL-1Beta for the Screening of Agonist and Antagonist Molecules for the ABC1 Protein

[0565] The principle of this test is that any substance modulating the activity of the ABC1 protein has repercussions on the synthesis of IL-1 beta.

[0566] a) The macrophage cells of the THP-1 line, human monocytic leukemia cells, are a model of differentiated macrophages. The cells are cultured in an RPMI 1640 medium supplemented with 10% foetal calf serum in multiwell plates at a density of 2×10⁵ cells per well.

[0567] b) For the test itself, the cells are then washed and placed in an RPMI 1640 medium containing 1 mg/ml of purified human albumin fraction IV.

[0568] c) The products are added to the extracellular medium. Simultaneously, the cells are then activated by addition of lipopolysaccharide (LPS) over 3 hours at 1 μg/ml followed by an incubation of 30 minutes in the presence of ATP at 5 mmol/L.

[0569] d) The concentrations of IL-1beta and of the controls IL-1alpha, tumor necrosis factor alpha (TNFalpha) and IL-6 are determined with ELISA kits according to the manufacturers' instructions (R&D Systems; human IL-1beta Chemiluminescent ELISA reference QLB00). The variations in the IL-1beta mRNA level, which is not expected to be affected, are evaluated by the Northern blotting technique with the corresponding probe.
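
One way to read the ELISA results of step d) is to compare each cytokine with an untreated control and to look for a change in IL-1beta that is not mirrored by the control cytokines. The Python sketch below illustrates that reasoning only; the thresholds, the concentrations and the function names are hypothetical and are not specified in the text.

```python
CONTROLS = ("IL-1alpha", "TNFalpha", "IL-6")

def fold_changes(treated, vehicle):
    """Ratio of treated to vehicle concentration for each cytokine."""
    return {name: treated[name] / vehicle[name] for name in vehicle}

def is_selective_il1b_modulator(treated, vehicle,
                                min_effect=0.5, max_control_drift=0.2):
    """True if IL-1beta changes markedly while control cytokines stay stable.

    min_effect and max_control_drift are arbitrary illustrative cut-offs
    on the deviation of the fold change from 1.0.
    """
    fc = fold_changes(treated, vehicle)
    il1b_changed = abs(fc["IL-1beta"] - 1.0) >= min_effect
    controls_stable = all(abs(fc[c] - 1.0) <= max_control_drift for c in CONTROLS)
    return il1b_changed and controls_stable

# Hypothetical concentrations (arbitrary units) for one test product.
vehicle = {"IL-1beta": 100.0, "IL-1alpha": 100.0, "TNFalpha": 100.0, "IL-6": 100.0}
treated = {"IL-1beta": 40.0, "IL-1alpha": 95.0, "TNFalpha": 105.0, "IL-6": 110.0}
print(is_selective_il1b_modulator(treated, vehicle))
```

A product that also shifted TNFalpha or IL-6 strongly would be flagged as non-selective under these assumed cut-offs, which would point to a general effect on cell activation.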

REFERENCES

[0570] Altschul S. F. et al, J. Mol. Biol. 1990 215 403-410

[0571] Altschul S. F. et al, Nucleic Acids Res. 1997 25: 3389-3402

[0572] Applebaum-Boden, JCI 97, 1996

[0573] Ausubel et al., 1989, Current Protocols in Molecular Biology, Greene Publishing Associates and Wiley Interscience, N.Y.

[0574] Beard et al., Virology 75 (1990) 81

[0575] Beaucage et al., Tetrahedron Lett 1981, 22: 1859-1862

[0576] Becq et al Journal of Biological Chemistry vol 272, No. 5 pages 2695-2699, 1997

[0577] Beg et al PNAS 87 p3473 1990

[0578] Bender et al., J. Virol. 61 (1987) 1639

[0579] Bernstein et al. Genet. Eng. 7 (1985) 235

[0580] Boulianne et al; Nature 312:643 (1984)

[0581] Breakfield et al., New Biologist 3 (1991) 203

[0582] Brown E L, Belagaje R, Ryan M J, Khorana H G, Methods Enzymol 1979;68:109-151

[0583] Cabilly et al., U.S. Pat. No. 4,816,567

[0584] Chen and Kwok Nucleic Acids Research 25:347-353 1997

[0585] Chen et al. Proc. Natl. Acad. Sci. USA 94/20 10756-10761,1997

[0586] Chen et al., 1987, Mol. Cell. Biol., 7: 2745-2752.

[0587] Chomczynski, P. and Sacchi, 1987, Anal. Biochem., 162, 156-159.

[0588] DeRisi J. et al., 1997, Science, 278, 680-686.

[0589] Dib C. et al, 1996, Nature, 380: 152-154.

[0590] Felgner et al., PNAS 84 (1987) 7413

[0591] Flotte et al, 1992, Am. J. Respir. Cell Mol. Biol., 7: 349-356.

[0592] Forsell Y. et al., 1997, Biol. Psychiatry, 42: 898-903

[0593] Fraley et al., 1979, Proc. Natl. Acad. Sci. USA, 76: 3348-3352.

[0594] Fraley et al., J. Biol. Chem. 255 (1980) 10431

[0595] Francis et al., 1995, J. Clin. Invest., 1:78-87

[0596] Fromont-Racine et al, 1993. Nucleic Acid Res. 21(7):1683-1684

[0597] Fuller S. A. et al., 1996, Immunology in Current Protocols in Molecular Biology, Ausubel et al.

[0598] Gopal, 1985, Mol. Cell. Biol., 5: 1188-1190.

[0599] Graham et al., 1973, Virology, 52: 456-457.

[0600] Graham et al., J. Gen. Virol. 36 (1977) 59

[0601] Haff L. A. and Smirnov I. P., Genome Research, 7:378-388, 1997

[0602] Harn, Methods Cell. Biol. 21a (1980) 255

[0603] Hames B D and Higgins S J, 1985, “Nucleic acid hybridization: a practical approach”, Hames and Higgins Ed., IRL Press, Oxford.

[0604] Hammerling et al., in: Monoclonal Antibodies and T-Cell Hybridomas, Elsevier, N.Y., pp. 563-681 (1981)

[0605] Harland et al., 1985, J. Cell. Biol., 101: 1094-1095.

[0606] Hoeg PNAS 93 p11448, 1996

[0607] Horvath S. et al., 1998, Am. J. Hum. Genet., 1998, 63: 1886-1897.

[0608] Houben Weyl, 1974, in Methoden der Organischen Chemie, E. Wunsch Ed., Volume 15-I and 15-II

[0609] Huygen et al., 1996, Nature Medicine, 2(8):893-898

[0610] Kaneda et al., Science 243 (1989) 375

[0611] Kim et al. Genomics (1996), 34:213-218)

[0612] Koch Y., 1977, Biochem. Biophys. Res. Commun., 74:488-491

[0613] Köhler et al, Eur. J. Immunol. 6:511 (1976)

[0614] Köhler et al, Eur. J. Immunol. 6:292 (1976)

[0615] Köhler et al, Nature 256:495 (1975)

[0616] Kohler G. and Milstein C., 1975, Nature, 256: 495.

[0617] Kozbor et al., 1983, Hybridoma, 2(1):7-16.

[0618] Lander and Schork, Science, 265, 2037-2048, 1994

[0619] Langmann T. et al., 1999, Biochem. Biophys. Res. Comm., 257: 29-33.

[0620] Leger O J, et al., 1997, Hum Antibodies, 8(1): 3-16

[0621] Levrero et al., Gene 101 (1991)

[0622] Luciani M. F. et al., 1994, Genomics, 21: 150-159.

[0623] Iyer V. et al., 1999, Science, 283: 83-87.

[0624] MacKinnon et al Gene 19 p33 1982

[0625] Maniatis et al. Molecular cloning: A Laboratory Manual, Cold Spring Harbor Press, Cold Spring, N.Y. 1982

[0626] Marcil M. et al., 1999, Arteriosclerosis, Thrombosis, and Vascular Biology, 17 1813-1821.

[0627] Martineau P, Jones P, Winter G, 1998, J Mol Biol, 280(1):117-127

[0628] McCormick, BioTechnology 3 (1985) 689

[0629] McGrory et al., Virology, 163, 614, 1988

[0630] McLaughlin B A et al., 1996, Am. J. Hum. Genet., 59: 561-569.

[0631] Merrifield R B, 1965a, Nature, 207(996): 522-523.

[0632] Merrifield R B., 1965b, Science, 150(693): 178-185.

[0633] Morrison et al., EP 173494

[0634] Morrison, Science 229:1202 (1985)

[0635] Narang S A, Hsiung H M, Brousseau R, Methods Enzymol 1979;68:90-98

[0636] Neuberger et al., Nature 314: 268 (1985)

[0637] Neuberger et al., WO 8601533

[0638] Nickerson D. A. et al., Genomics, 1992, 12: 377-387.

[0639] Nicolau C. et al., 1987, Methods Enzymol., 149:157-76.

[0640] Oi et al., Biotechnique 4:214 (1986)

[0641] Okayama, Mol. Cell. Biol. 7: 2745, 1987

[0642] Pagano et al., J. Virol. 1 (1967) 891

[0643] Potter et al., 1984, Proc Natl Acad Sci USA.; 81(22):7161-5

[0644] Reimann K A, et al., 1997, AIDS Res Hum Retroviruses. 13(11): 933-943

[0645] Remaley et al., ATVB 17, 1813, 1997; Chen et al., Mol. Cell. Biol. 7, p. 2745, 1987

[0646] Ridder R, Schmitz R, Legay F, Gram H, 1995, Biotechnology (N Y), 13(3):255-260

[0647] Rigaud et al. Biochimica et Biophysica Acta 1231 (1995) 223-246

[0648] Robinson et al., WO 8702671

[0649] Rust S. et al., Nature Genetics, vol. 20, September 1998, pages 96-98

[0650] Sambrook, J. Fritsch, E. F., and T. Maniatis. 1989. Molecular cloning: a laboratory manual. 2ed. Cold Spring Harbor Laboratory, Cold spring Harbor, N.Y.

[0651] Samulski et al., 1989, J. Virol., 63: 3822-3828.

[0652] Sanchez-Pescador R., 1988, J. Clin. Microbiol., 26(10):1934-1938

[0653] Sham P. C. et al., Ann. Hum. Genet., 1995, 59: 323-336.

[0654] Sternberg N. L., 1992, Trends Genet., 8: 1-16.

[0655] Sternberg N. L., 1994, Mamm. Genome, 5: 397-404.

[0656] Strautnieks S. S. et al., 1998, Nature Genetics, 20: 233-235.

[0657] Tacson et al., 1996, Nature Medicine, 2(8):888-892.

[0658] Taniguchi et al., EP 171496

[0659] Tur-Kaspa et al, 1986, Mol. Cell. Biol., 6: 716-718.

[0660] Urdea M. S., 1988, Nucleic Acids Research, 11: 4937-4957

[0661] Urdea M S et al., 1991, Nucleic Acids Symp Ser., 24: 197-200.

[0662] Vaisman J B C 270 p12269, 1995

[0663] Van den Hazel, H., H. Pichler, V. M. M. do, E. Leitner, A. Goffeau, and G. Daum. 1999. PDR16 and PDR17, two homologous genes of Saccharomyces cerevisiae, affect lipid biosynthesis and resistance to multiple drugs. J Biol Chem 274 (4):1934-41.

[0665] Wands et al., Gastroenterology 80:225-232 (1981)

[0666] Xiong M. et al., 1999, Am. J. Hum. Genet., 64: 629-640.

[0667] Yamon et al Blood vol 90, No. 8 pages 2911-2915, 1997

1 159 1 3153 DNA Homo sapiens 1 acatggcaat ggcattcatt aggaatctag ctgggaaaat ccagtgtgta tgcttggaaa 60 tgagggatct ggggctggag agaaaggcat gggcatgcct tggagggact tgtgtgtcaa 120 gctgaggacc tttactttaa gctctagggg accaggcaag gggagatgta gatacgttac 180 tctgatgggg tggatgaatt gaagaaggat gaggcaagaa tgaaggcaga gaccagggag 240 gaggctctcc aagtggccaa ggcataaagc aagaaatgag gcctggtgac tgcttagtgg 300 cagagcagtg aaagagaggg aggcatcaaa gtgagtctcg atttctagct gggtgggtgg 360 tagcgatgtc cagtaggcca gtggctactg aggtctgcag tggaggaggg tggttgggct 420 ggagacagat gatgagggag tcatcagcct gtgggtggaa gaaaagggaa cctcttccaa 480 ctgttttctt tgcttcttcc ctctctttct cttttttttt ttttttggac agagtcttgc 540 tctgtcaccc aggctgaaat gcagtggcat gatcttggct caccacagcc tccgcctcct 600 gggttcaagc aattctcctg tctcagcctc cagagtagct gggattacag gcacatatca 660 ctgtgcccgg ctaatttttg tattttcagt ggagatggga tttcaccatg ttggtcgggc 720 tggaatgaac tcctgacctc aagtgatcca cctgcctcag cctcccaaag tgttgggatt 780 acaggcattg agccaccgcg cccggccttt cttccctctc ttaaagagtg tttatttaat 840 tccacaaaca tgagcttgtc accccctgta gcctggcatc tcctacacga ggtgatggct 900 gaggcttctg cttctgctgg ggtagctctg atctttctgc tttctctggc actgtctacc 960 catgttgcct caccccacag gtcccagggc acctctctcg ggcaagtctt ggaaccctct 1020 gacactgatt tgctctcttt tctgagctgc ttttagccac ccatcctcgg gacctgtttt 1080 ctctctgcct ccacccctgc gggcagtctt aggtctcctg cccctcacga gcaccccaga 1140 gaggccacgt gctcagtgat ctcagtgggc gcatctttct agtcttgcta ttctttttgg 1200 ccatgttgtt cagaaaccat actgggcagg gccgacttca ccctaaaggc tgcgtctctt 1260 cactctgctt ttgtttgttc caaataaagt ggcttcagaa ttgctaaccc tagcctctgt 1320 gaacttgtga ggtacaattt tgtgtctgtt atgttaacaa aaatacatac ataccttcct 1380 ggtgatggta taaattgcta ttctctattg gaaagcaatt tggaatgaaa atttaaagaa 1440 ccattttaaa atatgctatc ctgcgtacct ccattccacc cacccccagg gatgtagcct 1500 actgaaataa ttttaaagaa gtcaccatat gagagaaaat gttattgcta tattgttatt 1560 gtgagaaatt ggaaatagac taaatgttca gcactatagg aataattaat gaaattacat 1620 atactctata caatcattat gctgccattg aaataataaa tacaaaggcg caagggggga 
1680 aaagcttata atgttagtga aactaagact gattttttta taaagcagca gttttcagac 1740 ccttggagac tccaattcgg tagaaccaga gcttcatctt ctctgtcgaa gctgtgacag 1800 gagttgcaaa tgcctctcct ttttgctgag tttgcagctg ctgtttttcc ggcagcacat 1860 ctgtgcaggc ctctgcctcg gcccctctgg atctgctgat tgagcagcgg attgatctgt 1920 ccttctcttt cgtgttgacc catgtgagga accaactggc aagggaacaa gaaatggaaa 1980 taggcctcct ttgcatcatg acctgtacat cctgcaattg gaaaagattg tactttagtt 2040 ggtttaacca gcagcattat ttttctaaac taagcagtaa gaaggaatta ggttttatgt 2100 gggatcaaca gactgggtct caaaagagga aggtgataga acacagtggg gagggggagg 2160 tgcactagaa acagagggcc tatgctttca ttctggcttt gctacttaat agctgtgtga 2220 cccaatctta gagacttaac ctctctgaac ttccattttc tcatgtataa aatgggaaat 2280 attaaaggat actcactggg ctggtggctt gtgcctgtaa tcccagcact tggggaggtt 2340 gaggtgggag gatcacttga gcccaggtgt tcaagaccag cccaggcaac atggcaagac 2400 tctgtctcta tgaaaaaatt aaaaattagc caggtgtggt ggtgtgcacc tgtagtctta 2460 gctacttggt aggctgagat gggaggatca cttgggcttg ggaggtcaag gctgcggtga 2520 gctgtgattc catcactgca ctccagcccg ggcggcagag cgagacactg aatccaaacg 2580 acaacaacaa caaaaggcaa aaaaataaaa gtgccctctt tatggagttg tgtaaggtga 2640 agcatataca ctattcaaca tagtaactat ataaaggaag tattgttgtt gttactgtag 2700 ttaataccat taagtgagat gtttcgtata gtggaaagca catggactct gaattcagac 2760 tggtctgact ttgagtctca gctccacatc tagtaatact atgaccaagc cctggttaaa 2820 atcatgtttt tttttcttca gccttagtct tctcacatat aaaataggga cactgtcatt 2880 tacctcagtt ttctgtgagg ataaaacaac gacagtgtat atgcaagtat tttgtaaatt 2940 ttgtagtgct cctcaagatt tagttggtgt ttactacttg tactttctca ctggaatggc 3000 agatgctgtt ggacagcagg gacaatgacc acttttggga acagcagttg gatggcttag 3060 attggacagc ccaagacatc gtggcgtttt tggccaagca cccagaggat gtccagtcca 3120 gtaatggttc tgtgtacacc tggagagaag ctt 3153 2 7660 DNA Homo sapiens misc_feature (3854)..(3854) n is a, c, g, or t 2 aattcgggtc caattaaatt tttgaaattt tatattaaaa attatattag tagggatggg 60 taagaggtgt tttggtctgg ttggttggtt agttgctatg actcagaatt gctaagaaaa 120 cagaaaagta agataagatc attgttttaa 
cctcttttcc tccacaaaat caataaataa 180 catatcccta aattactctt agaatttctc ttaaattgca gtgaaaaacc aaaatccttc 240 attcttggtt gaaggttgga aaactacgtt agagaggatt agagagagag gatgagcaat 300 cgtgtagtca gcccttgcct cctagtgtag gatttgtctc agccactgct tgttgtcctg 360 gctgccaacg ttctcatgaa ggctgttctt ctatcagtgt gtcaacctga acaagctaga 420 acccatagca acagaagtct ggctcatcaa caagtccatg gagctgctgg atgagaggaa 480 gttctgggct ggtattgtgt tcactggaat tactccaggc agcattgagc tgccccatca 540 tgtcaagtac aagatccgaa tggacattga caatgtggag aggacaaata aaatcaagga 600 tgggtaagtg gaatcccatc acaccagcct ggtcttgggg aggtccagag cacctattat 660 attaggacaa gaggtacttt attttaacta aaaatttggt agaaatttca acaacaacaa 720 aaaaactcaa cttggtgtca tgattttggt gaaattggta catgacttgc tggaaggttt 780 ttcataggtc ataaaataac agtatctttt gatttagcat ttctactcaa gggaattaat 840 tccaggaatt ttggtggcag gcacctgtaa tcccagctac tcgggaggct gaggcaggag 900 aattgcttga acccaggagg cagaggttgc agtgagctaa gatcgcatca ttgcactccc 960 gcctgggcaa taagagtgaa actccatctc aaaaaaaaaa aaagatacaa aaatagaaaa 1020 aggggcttgg taagggtagt agggttttgg gcaatttttt tttttttttt tttttattgt 1080 atggttctaa aggaatggtt gattacctgt ggtttggttt taggtactgg gaccctggtc 1140 ctcgagctga cccctttgag gacatgcggt acgtctgggg gggcttcgcc tacttgcagg 1200 atgtggtgga gcaggcaatc atcagggtgc tgacgggcac cgagaagaaa actggtgtct 1260 atatgcaaca gatgccctat ccctgttacg ttgatgacat gtaagttacc tgcaagccac 1320 tgtttttaac cagtttatac tgtgccagat gggggtgtat atatgtgtgt gcatgtgcat 1380 gcatgtgtga atgatctgga aataagatgc cagatgtaag ttgtcaacag ttgcagccac 1440 atgacagaca tagatatatg tgcacacact agtaaacctc tttccttctc atccatggtt 1500 gccactttta tctttttatt tttatttttt tttttgagat ggagtctcgc tctgacgccc 1560 aggctggagt gcagtggctc gatctcggct cactgcaacc tttgcctccc gggttcaagc 1620 tattctcctg cctcagcctc cacagtagct gggactacag gctcatgctg ccacgcccgg 1680 ctgacttttt gtattttagt agagacgagg tttcaccatg ttacccaggc tagacttcaa 1740 ctcctgagct caggcaatcc accctccttg gcctcccaaa gtgctgggat tacaggtgtg 1800 agccactgca cccagcccac cactttaatt ttttacactc tacccttttg 
gtcaaaattt 1860 gctcaatctg caagcttaaa atgtgtcatg acaaacacat gcaagcacat actcacacat 1920 agatgcagaa acagcgtcta aacttataaa agcacagttt atgtaaatgt gtgcacttct 1980 tctccctagg tggtaaacca catttcaaaa caacccaaat aaaactgaac aaagcttctt 2040 cctcttagac tttttagaaa atctttcagt gctgagtcac taagctgcca agttctcatt 2100 gtgggaacta tgcctttgga tgtaatgatt tcttctaaga caatgggcgg aggtgtagtt 2160 attgcagaca tctgaaatat gtaatgtttc ttccagattc tggaaattct cttattctct 2220 gtggttggtg gtggtggtgg gatgtgtgtg tgtgtgtgtg tgtgtgtgtg tgtgtgtgtg 2280 tagggatcag gatgcgggag gagctgggtt ctgcttgtat tggttctctg ttttgcattg 2340 aatagtgtgt ttccttgtat ggctatctat agcttttcaa ggtcaccaga aattatcctg 2400 tttttcacct tctaaacaat tagctggaat ttttcaaagg aagactttta caaagacccc 2460 taagctaagg tttactctag aaaggatgtc ttaagacagg gcacaggagt tcagaggcat 2520 taagagctgg tgcctgttgt catgtagtga gtatgtgcct acatggtaaa gctttgacgt 2580 gaacctcaag ttcagggtcc aaaatctgtg tgccttttta ctttgcacat ctgcattttc 2640 tattctagct tggaatctga aacattgaca agagctgcct gaaatgtatg tctgtggtgt 2700 gattagagtt acgataagca agtcaatagt gagatgacct tggagatgtt gaacttttgt 2760 gagagaatga gttgtttttt tgttttggtt tttagtactt taacataatc tacctttagt 2820 ttaagtatcg ctcacagtta cctagttact gaagcaagcc cccaaagaaa tttggtttgg 2880 caacactttg ttagcctcgt ttttctctct acattgcatt gctcgtgaag cattggatca 2940 tacgtacatt tcagagtcta gagggcctgt ccttctgtgg cccagatgtg gtgctccctc 3000 tagcatgcag gctcagaggc cttggcccat caccctggct cacgtgtgtc tttctttctc 3060 cccttgtcct tccttggggc ctccagcttt ctgcgggtga tgagccggtc aatgcccctc 3120 ttcatgacgc tggcctggat ttactcagtg gctgtgatca tcaagggcat cgtgtatgag 3180 aaggaggcac ggctgaaaga gaccatgcgg atcatgggcc tggacaacag catcctctgg 3240 tttagctggt tcattagtag cctcattcct cttcttgtga gcgctggcct gctagtggtc 3300 atcctgaagg taaggcagcc tcactcgctc ttccctgcca ggaaactccg aaatagctca 3360 acacgggcta agggaggaga agaagaaaaa aaatccaagc ctctggtaga gaaggggtca 3420 tacctgtcat ttcctgcaat ttcatccatt tatagttggg gaaagtgagg cccagagagg 3480 ggcagtgact tgcccaaggt caacccagcc gggtagcagc taagtaggat gagagtgcag 
3540 ggttcatgct ttccagataa ccacatgctc aactgtgcca tgctgtctca ttggtagtgg 3600 ttcatggcag catctgaaag ctatttattt tcttagatat attgggtggc gattcttcct 3660 aagtttctaa gaacaataat cagaaggata tatattgttg caggttagac tgtctggaag 3720 cagaggctga aatagagttt gatgtatggg tatttatgag ggctcaatac ctatgaagag 3780 atatggaaga tgcaggattg ggcagaggga ggagttgaac tgtgatatag ggccaacccc 3840 gtggggcact ctanagaata tgcagcttgt tggagttgtt nttcatcgag ctgaaacatc 3900 cagccctttg tgctccccca aggcctccct cctgacacca cctacctcag ccctctcaat 3960 caatcactgg atgtgggctg ccctgggaag gtcgtgcccc agggcctaca tggctctctg 4020 ctgctgtgac aaacccagag ttgctgatgc ctgaggccgt ctactgacag ctgggcaaca 4080 aggcttccct gaatggggac tctgggcagt gcagttttgt gtctgaacca tacattaata 4140 tatttatatc cgaattttct ttctctgcaa gcatttcata taaagacaca tcaggtaaaa 4200 ataaatgttt ttgaagcaaa aggagtacaa agagataaga actaactaat ttaatactag 4260 ttaccatctg ttacaaatag ttcctactga ttgccaagga ctgtttaaac acatcacatg 4320 ggcttcttct tctatcctca ctaacccttt taacagacaa ggaaatgagg ctcaggaagg 4380 tcaaggactt tattgaggtt ccacagtagg atacagttct tgctaaaagc aacccctccc 4440 tcatgctctg ttatctaact gcaaggggaa ggtcagtggc agaggtagtg gtcccatggt 4500 tggtgcataa gagctgctct gagacaactg catgctggtg ggtcctgcag acatgtaccc 4560 atcagccgga gataggctca aaatatccac aagagtttgg atgattgtgg gaatgcagaa 4620 tccatggtga tcaagaggga aagtcaagtt gcctggccat tttccttggc ttttagacag 4680 aaaagttacg tgggatatta tctcccacag ctcttctgtg gtgccaccag tcatagtcct 4740 tatataagga gaaaccagtt gaaattacct attgaagaaa caaagagcaa actcgcccac 4800 tgaaatgcgt agaaagccct ggactctgtt gtattcataa ctctgccatt atttttctgc 4860 gtagttttgg gtaagtcact tatcttcttt aggatggtaa tgatcagttg cctcatcaga 4920 aagatgaaca gcattacgcc tctgcattgt ctctaacatg agtaggaata aaccctgtct 4980 tttttctgta gatcatacaa gtgagtgctt gggattgttg aggcagcaca tttgatgtgt 5040 ctcttccttc ccagttagga aacctgctgc cctacagtga tcccagcgtg gtgtttgtct 5100 tcctgtccgt gtttgctgtg gtgacaatcc tgcagtgctt cctgattagc acactcttct 5160 ccagagccaa cctggcagca gcctgtgggg gcatcatcta cttcacgctg tacctgccct 5220 
acgtcctgtg tgtggcatgg caggactacg tgggcttcac actcaagatc ttcgctgtga 5280 gtacctctgg cctttcttca gtggctgtag gcatttgacc ttcctttgga gtccctgaat 5340 aaaagcagca agttgagaac agaagatgat tgtcttttcc aatgggacat gaaccttagc 5400 tctagattct aagctcttta agggtaaggg caagcattgt gttttattaa attgtttacc 5460 tttagtcttc tcagtgaatc ctggttgaat tgaattgaat ggaatttttc cgagagccag 5520 actgcatctt gaactgggct ggggataaat ggcattgagg aatggcttca ggcaacagat 5580 gccatctctg ccctttatct cccagctctg ttggctatgt taagctcatg acaaagccaa 5640 ggccacaaat agaactgaaa actcttgatg tcagagatga cctctcttgt cttccttgtg 5700 tccagtatgg tgttttgctt gagtaatgtt ttctgaacta agcacaactg aggagcaggt 5760 gcctcatccc acaaattcct gacttggaca cttccttccc tcgtacagag cagggggata 5820 tcttggagag tgtgtgagcc cctacaagtg caagttgtca gatgtcccca ggtcacttat 5880 caggaaagct aagagtgact cataggatgc tcctgttgcc tcagtctggg cttcataggc 5940 atcagcagcc ccaaacaggc acctctgatc ctgagccatc cttggctgag cagggagcct 6000 cagaagactg tgggtatgcg catgtgtgtg ggggaacagg attgctgagc cttggggcat 6060 ctttggaaac ataaagtttt aaaagtttta tgcttcactg tatatgcatt tctgaaatgt 6120 ttgtatataa tgagtggtta caaatggaat cattttatat gttacttggt agcccaccac 6180 tcccctaaag ggactctata ggtaaatact acttctgcac cttatgattg atccattttg 6240 caaattcaaa tttctccagg tataatttac actagaagag atagaaaaat gagactgacc 6300 aggaaatgga taggtgactt tgcctgtttc tcacagagcc tgctgtctcc tgtggctttt 6360 gggtttggct gtgagtactt tgcccttttt gaggagcagg gcattggagt gcagtgggac 6420 aacctgtttg agagtcctgt ggaggaagat ggcttcaatc tcaccacttc ggtctccatg 6480 atgctgtttg acaccttcct ctatggggtg atgacctggt acattgaggc tgtctttcca 6540 ggtacactgc tttgggcatc tgtttggaaa atatgacttc tagctgatgt cctttctttg 6600 tgctagaatc tctgcagtgc atgggcttcc ctgggaagtg gtttgggcta tagatctata 6660 gtaaacagat agtccaagga caggcagctg atgctgaaag tacaattgtc actacttgta 6720 cagcacttgt ttcttgaaaa ctgtgtgcca ggcagcatgc aaaatgtttt atacacattg 6780 cttcatttaa ttctcacaag gctactctga agtagttact ataataacca gcaattttca 6840 aatgagagaa ctgtgactca aagacgttaa gtaaccagct ttggtcacac aactgttaaa 6900 tgttggtacg 
tggaggtgaa tccacttcgg ttacactggg tcaataagcc caggcgaatc 6960 ctcccaatgc tcacccaatt ctgtatttct gtgtcctcag agggggtaca actaggagag 7020 gttctgtttc ctgagtacag gttgttaata attaaatata ctagctctaa ggcctgcctg 7080 tgatttaatt agcattcaat aaaaattcat gttgaatttt tctttagtac ttctttctta 7140 atataataca tcttcttgac caagtccaag aggaacctgc gttggacagt tttcatatga 7200 gatcaaattc tgagagagca agatttaacc ctttttggtt caccttctga tcctccccta 7260 aggaggtata catgaaatat ttattactcc tgcctgaact tctttcattg aatatgcaat 7320 tttgcagcat gcagattctg gatttaaatt ctgagtctta acttactggc tgagggacct 7380 tggataggct ccttatccct cagtttcctc atctctaaaa tggggatggc acctgccccg 7440 tgggttgttg gaaggactta cagaggtgca gaatgtacgt tgtacatagc aggtttcagc 7500 aaatgttagc tccctctttc cccacatcca ttcaaatctg ttccttctcc aaaggatgtg 7560 tcaaggagga aatggacctg gctgggaaac cctcagaata ctgggatgat gctgagcttg 7620 gctcatacct gtgctttgct ttcaggccag tacggaattc 7660 3 1285 DNA Homo sapiens misc_feature (24)..(24) n is a, c, g, or t 3 gaattcccag gccctggtat tttncttgca ccaagtccta ctggtttggc gaggaaagtg 60 atgagaagag ccaccctggt tccaaccaga agagaatatc agaaagtaag tgctgttgac 120 ctcctgctct ttctttaacc tagtgctgct gcctctgcta actgttgggg gcaagcgatg 180 tctcctgcct ttctaaaaga ctgtgaaacc actccagggg cagagaaatc acatgcagtg 240 tccctttcca aatcctccca tgccatttat gtccaatgct gttgacctat tgggagttca 300 cggtctcgat ccctgaggga cattttcttt gttgtcttgg cttctagaag agtatctttt 360 acttgccccc tcccaaacac acatttcatg gtctcctaac aagctagaag aaagaggtaa 420 agacaagcgt gattgtggaa ccatagcctc gctgcctgcc tgtgacatgg tgacctgtgt 480 atcagcctgt gtgggctgag accaagtggc taccacagag ctcagcctat gcttcataat 540 gtaatcatta cccagatccc taatcctctc ttggctctta actgcagaca gagatgtcca 600 cagctcatca aaggctctgc cttctgggtt ctttgtgctt agagtggctt cctaaatatt 660 taataggtcc cttttctgcc agtctcttct gtgcccatcc cctgattgcc cttggtaaaa 720 gtatgatgcc ccttagtgta gcacgcttgc ctgctgttcc taatcatctt ctcctacctc 780 ctctttacac ctagctcctg tttcagtcac ctagaaatgc tcacagtcgc tggaatatgt 840 catgttcttc cacacctcca tgcctttgta ggtactgttt gctctcacag 
gagaactttc 900 tctctaactt gcctatcttc tcaactcctc ctttctctcc aagatctagt tccggatccc 960 ctcccctgag catccctcct tggttctcag gtagtcagtc actctctgcc ctgaacttcc 1020 atggcacgtg aaagaaaatc tttttatttt aaaacaatta cagactcaca agaagtaata 1080 caaattacat gagggggttc ccttaaacct ttcatccagt ttccccaatg gtagcagcat 1140 gtgtaactgt agaatagtat caaaaccatg aaattgacat aggtacaatt cacaaacctt 1200 cttcagattt cactagcttt atgtgcgctc atttgtgtgt gtgtgtgcgt atttagttct 1260 atgcaatttt atcatgtgtg aattc 1285 4 1521 DNA Homo sapiens 4 gatccctggg ccaagggaag gagcacatga ggagttgccg aatgtgaaca tgttatctaa 60 tcatgagtgt ctttccacgt gctagtttgc tagatgttat ttcttcagcc taaaacaagc 120 tggggcctca gatgaccttt cccatgtagt tcacagaatt ctgcagtggt cttggaacct 180 gcagccacga aaagatagat tacatatgtt ggagggagtt ggtaattccc aggaactctg 240 tctctaagca gatgtgagaa gcacctgtga gacgcaatca agctgggcag ctggcttgat 300 tgccttccct gcgacctcaa ggaccttaca gtgggtagta tcaggagggg tcaggggctg 360 taaagcacca gcgttagcct cagtggcttc cagcacgatt cctcaaccat tctaaccatt 420 ccaaagggta tatctttggg gggtgacatt cttttcctgt tttcttttta atcttttttt 480 aaaacataga attaatatat tatgagcttt tcagaagatt tttaaaaggc agtcagaaat 540 cctactacct aacacaaaaa ttgtttttat ctttgaataa tatgttcttg tttgtccatt 600 ttccatgcat gcgatgttag gcatacaaaa tacatttttt aaagaatact ttcattgcaa 660 attggaaact tcgtttaaaa aatgctcata ctaaaattgg catttctaac ccataggccc 720 acttgtagtt atttaccgaa gcaaaaggac agctttgctt tgtgtgggtc tggtagggtt 780 cattagaaag gaatgggggc ggtgggaggg ttggtgttct gttctctctg cagactgaat 840 ggagcatcta gagttaaggg taggtcaacc ctgacttctg tacttctaaa tttttgtcct 900 caggtcaatc ctgaccgggt tgttcccccc gacctcgggc accgcctaca tcctgggaaa 960 agacattcgc tctgagatga gcaccatccg gcagaacctg ggggtctgtc cccagcataa 1020 cgtgctgttt gacatgtgag taccagcagc acgttaagaa taggcctttt ctggatgtgt 1080 gtgtgtcatg ccatcatggg aggagtggga cttaagcatt ttactttgct gtgtttttgt 1140 tttttctttt tttctttttt atttttttga gatggagtct cgctctgtag ccaggctgga 1200 ctgtagtggc gcgatctcgg ctcactgcaa ccttggcctc ccaggttcaa gcgattctcc 1260 tgcctcagcc tcccgagtag 
ctgggactct aggcacacac caccatgccc agctaatttt 1320 tgtgttttta gtagagacgg ggtttcacca tgttggccag gatggtctca atgtcttgac 1380 ctcgtgatcc gcccacctcg gtctcccaaa gtgctgggaa cacaggcatg agccactgtg 1440 tctggccaca ttttactttc tttgaatatg gcaggctcac ctccgtgaac accttgagac 1500 ctagttgttc tttgatttta g 1521 5 6519 DNA Homo sapiens misc_feature (3523)..(3523) n is a, c, g, or t 5 gaaattgaaa gttgtaactg cctggtgcat ggtggccagg cctgctggaa acaggttgga 60 agcgatctgt cacctttcac tttgatttcc tgagcagctc atgtggttgc tcactgttgt 120 tctaccttga atcttgaaga ttatttttca gaaattgata aagttatttt aaaaagcacg 180 gggagagaaa aatatgccca ttctcatctg ttctgggcca ggggacactg tattctgggg 240 tatccagtag ggcccagagc ttgacctgcc tccctgtccc caggctgact gtcgaagaac 300 acatctggtt ctatgcccgc ttgaaagggc tctctgagaa gcacgtgaag gcggagatgg 360 agcagatggc cctggatgtt ggtttgccat caagcaagct gaaaagcaaa acaagccagc 420 tgtcaggtgc ggcccagagc taccttccct atccctctcc cctcctcctc cggctacaca 480 catgcggagg aaaatcagca ctgccccagg gtcccaggct gggtgcggtt ggtaacagaa 540 acttgtccct ggctgtgccc ctaggtcctc tgccttcact cactgtctgg ggctggtcct 600 ggagtttgtc ttgctctgtt tttttgtagg tggaatgcag agaaagctat ctgtggcctt 660 ggcctttgtc gggggatcta aggttgtcat tctggatgaa cccacagctg gtgtggaccc 720 ttactcccgc aggggaatat gggagctgct gctgaaatac cgacaaggtg cctgatgtgt 780 atttattctg agtaaatgga ctgagagaga gcggggggct tttgagaagt gtggctgtat 840 ctcatggcta ggcttctgtg aagccatggg atactcttct gttatcacag aagagataaa 900 gggcattgag actgagattc ctgagaggag atgctgtgtc tttattcatc tttttgtccc 960 caacatggtg cactaaattt atggttagtt gaaagggtgg atgcttaaat gaatggaagc 1020 ggagaggggc aggaagacga ttgggctctc tggttagaga tctgatgtgg tacagtatga 1080 ggagcacagg caggcttgga gccaactctg gctggccctg agacattggg aaagtcacaa 1140 cttgcctcac cttctttgcc gataataata gtggtgctta cctcatagag gattaaatta 1200 aatgagaatg cacacaaacc acctagcaca atgcctggca tatagcaagt tcccaaataa 1260 aatgctactg ttcttacctc tgtgaggatg tggtacctat atatacaaag ctttgccatt 1320 ctaggggtca tagccataca gggtgaaagg tggcttccag gtctcttcca gtgcttaccc 1380 ctgctaatat 
ctctctagtc cctgtcactg tgacaaatca gaactgagag gcctcacctg 1440 tcccacatcc ttgtgtttgt gcctggcagg ccgcaccatt attctctcta cacaccacat 1500 ggatgaagcg gacgtcctgg gggacaggat tgccatcatc tcccatggga agctgtgctg 1560 tgtgggctcc tccctgtttc tgaagaacca gctgggaaca ggctactacc tgaccttggt 1620 caagaaagat gtggaatcct ccctcagttc ctgcagaaac agtagtagca ctgtgtcata 1680 cctgaaaaag gtgagctgca gtcttggtgc tgggctggtg ttgggtctgg gcagccagga 1740 cttgctggct gtgaatgatt tctccatctc cacccctttt gccatgttga aaccaccatc 1800 tccctgctct gttgcccctt tgaaatcata tcatacttaa ggcatggaaa gctaaggggc 1860 cctctgctcc cattgtgcta gttctgttga atcccgtttt ccttttccta tgaggcacag 1920 agagtgatgg agaaggtcct tagaggacat tattatgtca aagaaaagag acttgtcaag 1980 aggtaagagc cttggctaca aatgacctgg tgttcctgct cattactttt caatctcatt 2040 gaccttaact tttaaactat aaaacagcca atatttatta ggcactgatt tcatgccaga 2100 gacactctgg gcatgaaaga aagtaatgat aatagttaat tttatatagc gttgttacca 2160 tttacaacct tttttttttt tttaacctct atcatctcaa ttaaagtgca gagagaccct 2220 gggaagaagg taactatatt tattatccca gatgagggaa gtgaggcttg tagggaattg 2280 gtagctgatt caaggtcacc cagcaggtaa ataacagtgg tgggaccaga cccaattacc 2340 aggtatgttt tcctctgtac cgcagtacat gcctgagatt tatttgtgtg ttgaagccag 2400 tggtacctaa tgtatttaca tcccaacctg aaactcctat ccacttattt accttttaat 2460 gagcctctta actcaagtgc agtctgagga ccagcagcat caggatcact tgggaacttg 2520 ttagaaattc agcaacctgg gcccagctca gacctaccga atcagaatct gtgcatttta 2580 acaaggttct tgagtggttg aacacacatt aaagcatgag aagcattgaa ctagacatgt 2640 agccaggtaa aggccttgcc tgagatggtt ggcaaaggcc tcattgcagc attcattggc 2700 aggccacagt tcttttggca gctctgcttc ctgacctttc accctcagga agcgaggctg 2760 ttcacacggc acacacatgc cagacagggt cctctgaagc cacggctgcc agtgcatgtg 2820 tcccagggaa agctttttcc tttagttctc acacaacaga gcttcttgga agccctcccc 2880 ggcaaaggtg ctggtggctc tgccttgctc cgtccctgac ccgttctcac ctccttcttt 2940 gccatcagga ggacagtgtt tctcagagca gttctgatgc tggcctgggc agcgaccatg 3000 agagtgacac gctgaccatc ggtaaggact ctggggtttc ttattcaggt ggtgcctgag 3060 cttcccccag ctgggcagag 
tggaggcaga ggaggagagg tgcagaggct ggtggcgctg 3120 actcaaggtt tgctgctggg ctggggctgg gtggctgcgg gtgtgggagc agcttggtgg 3180 cgggttggcc taatgcttgc tggggtgcct ggggctcggt ttgggagcta gcagggcagt 3240 gtcccagaga gctgagatga ttggggtttg gggaatccct taggggagtg gacactgaat 3300 accagggatg aggagctgag ggccaagcca ggagggtggg atttgagctt agtacataag 3360 aagagtgaga gcccaggaga tgaggaacag ccttccagat ttttcttggg tagcgtgtgt 3420 aggaggccag tgtcaccagt agcatatgtg gaacagaagt cttgaccctt gctatctctg 3480 cctagtccta atggctggct tttcccagga aggcttctgc ttncatggac ngntagatta 3540 accctttatt taggtaaatg agggaaccta ctttataagc ataggaaagg gtgaagaatc 3600 ttttaagatt cctttactca agttttcttt tgaagaatcc cagagcttag gcaatagaca 3660 ccagactttg agcctcagtt atccattcac ccatccaccc acccacccac ccatccttcc 3720 atcctcccat cctcccattc acccatccac ccatccagct gtccacccat tctacactga 3780 gtacctataa tgtgcctggc tttggtgata caaaggtgaa taagacatag tcctttcctt 3840 tgcccccaac cctcagacca gagatgaaca tgtggaatga cctaaacacc tggaacaggt 3900 gtggtgtatg agcggcaggc ctctgatgag agggtggggg atggccagcc ctcactccga 3960 agcccctctg agttgattga gccatctttg cattctggtc cctgcagatg tctctgctat 4020 ctccaacctc atcaggaagc atgtgtctga agcccggctg gtggaagaca tagggcatga 4080 gctgacctat gtgctgccat atgaagctgc taaggaggga gcctttgtgg aactctttca 4140 tgagattgat gaccggctct cagacctggg catttctagt tatggcatct cagagacgac 4200 cctggaagaa gtaagttaag tggctgactg tcggaatata tagcaaggcc aaatgtccta 4260 aggccagacc agtagcctgc attgggagca ggattatcat ggagttagtc attgagtttt 4320 taggtcatcg acatctgatt aatgttggcc ccagtgagcc atttaagatg gtagtgggag 4380 atagcaggaa agaagtgttt tcctctgtac cacagtacat gcctgagatt tgtgtgttga 4440 aaccagtggt acctaacaca tttacatccc aaccttaaac tcctatgcac ttatttaccc 4500 tttaatgagc ctctttactt aagtacagtg tgaggaacag cggcatcagg atcacttggg 4560 aacttgttag aaatacagca acttgggccc agctcagacc tactgaatca gaatcaggag 4620 caattctctg gtgtgactgt gtcacagcca ggtatcaact ggattctcat acataggaaa 4680 tgacaaacgt ttatggatgg atagtctact tgtgccaggt gctgagattt gttttttgtt 4740 ttttgatttt tttttaatca ctgtgacctc 
atttaattct caaaaaaaga tgaaaaaatg 4800 aacactcagg aatgctgaca tgagattcag aatcaggggt ttggggcttc aaagtccatc 4860 ctctctttat ccatgtaatg cctcccctta gagatacaac atcacagacc ttgaaggctg 4920 aaggggatat aaaagctgtc tggccaagtg gtctccaagc ttgacagtgc agcagaatca 4980 cctggggata ttattaaaaa taaacatact aaggtttggc ttcagggcct gtgaatcaga 5040 atttctggag gtgaggcctt gaagtctgta tttctattgc atactttgga cacagtggtc 5100 tatagactag agtttggaaa tgattgcgct cattcagatt ctcttctgat gtttgaattg 5160 ctgccatcat atttctagtg ctctatttcc tcctgctcat tctgtcttgg ataacttatc 5220 atagtactag cctactcaaa gatttagagc cacagtcctg aaagaagcca cttgactcat 5280 tccctgtagg ttcagaataa atttcttctg cgcagtgtct gtcatagctt tttttaaatt 5340 tttttttatt tttgatgaga ctggagtttt gctcttattg cccaagctgg agtgcagtgg 5400 tgcgattttg gctcactgca acctccacct cccaggttca agcgattctc ctgcctcagc 5460 ctcccaagta gctgagatta caagcatgtg ctaccacgcc cagctaattt tgtattttta 5520 gtagagatgg gttttatcca tgttggtcag gctggtctcg agctccagac ctcaggtgat 5580 ctgcccgcct cggcctccca aagtgctggg attataggcc tgagccacag cgctcagcca 5640 taactttaat ttgaaaatga ttgtctagct tgatagctct caccactgag gaaatgttct 5700 ctggcaaaaa cggcttctct cccaggtaac tctgagaaag tgttattaag aaatgtggct 5760 tctactttct ctgtcttacg gggctaacat gccactcagt aatataataa tcgtggcagt 5820 ggtgactact ctcgtaatgt tggtgcttat aatgttctca tctctctcat tttccagata 5880 ttcctcaagg tggccgaaga gagtggggtg gatgctgaga cctcaggtaa ctgccttgag 5940 ggagaatggc acacttaaga tagtgccttc tgctggcttt ctcagtgcac gagtattgtt 6000 cctttccctt tgaattgttc tattgcattc tcatttgtag agtgtaggtt tgttgcagat 6060 ggggaaggtt tgttttgttg taaataaaat aaagtatggg attctttcct tgtgccttca 6120 gatggtacct tgccagcaag acgaaacagg cgggccttcg gggacaagca gagctgtctt 6180 cgcccgttca ctgaagatga tgctgctgat ccaaatgatt ctgacataga cccaggtctg 6240 ttagggcaag atcaaacagt gtcctactgt ttgaatgtga aattctctct catgctctca 6300 cctgttttct ttggatggcc tttanccaag gtgatagatc cctacagagt ccaaagagaa 6360 gtgaggaaat ggttaaagcc acttgttttt tgcagcatcg ngcatgtnat caaacctgan 6420 agagcctatc catatcactt tnctttaana gacattaaan 
atggntcctt aatctctttt 6480 gancccattg tatttattat tctttttctg cgggggtcc 6519 6 7378 DNA Homo sapiens 6 tctagaaaat ttttaggaac agaaaacttt ccagttctct cacccctgct caaagagtgt 60 atggctctta cattatatat aactgcctga cttcatacag tatcagtact tagatcattt 120 gaaatgtgtc cacgttttac caaaatataa tagggtgaga agctgagatg ctaattgcca 180 ttgtgtattc tcaaatatgt caagctacgt acatggcctg tttcatagag tagtctataa 240 gaaattgatg acttgattca tccgaatggc tggctgtaac acctggttac gcatgaacac 300 ctcttttcag ttgtctcaag acacctttct tttctgtact tatcagacaa ggactgaaag 360 gcagagactg ctactgttag acattttgag tcaagctttt ccttggacat agctttgtca 420 tgaaagccct ttacttctga gaaacttcta gcttcagaca catgccttca agatagttgt 480 tgaagacacc agaagaagga gcatggcaat gccgaaaaca cctaagataa taggtgacct 540 tcagtgttgg cttcttgcag aatccagaga gacagacttg ctcagtggga tggatggcaa 600 agggtcctac caggtgaaag gctggaaact tacacagcaa cagtttgtgg cccttttgtg 660 gaagagactg ctaattgcca gacggagtcg gaaaggattt tttgctcagg tgagacgtgc 720 tgttttcgcc agagactctg gcttcatggg tgggctgcag gctctgtgac cagtgaaggc 780 aggatagcat cctggtcaag atatggatgc cggagccaga tttatctgta tttcaatccc 840 agttctattc cttgccagtt gtgtatccgc tggcaagtta cttctctatg cctcaatctc 900 ctcatctgta aaatggggat aataatatta cctgcaatac agggttgtta cgaaaataaa 960 aatgaatagg tgcttagaat ggggcctgac attagtaagt gcttagtttt gtgtgtgtat 1020 atgttatttt tattttggag gagaacataa aaaggacaaa gtgtagaaaa actggttggg 1080 tgtattcagc tgtcataaca tgagagttgt tatgcccaga tgcacttgac atgtgaattt 1140 attagaaaca tgatttttct ctgagttgat gtttaactca aactgataga aaagataggt 1200 cagaatatag ttggccaaca gagaagactt gttagactat tgtctgcatg tcagtgtttg 1260 catgctaact tgcttagtta gaaaggttaa attttttcac tctataaaat caagaaatat 1320 agagaaaagg tctgcagaga gtctttcatt tgatgatgtg gatattgtta agagcgggag 1380 tttggagcat acagagctca agttgaatcc tgactttgct acttattggc tatatgacct 1440 tgggcaagct gcttagtctc tctgatcctc agttaccttt gtttgttgat gatgaccatt 1500 gataacacaa ccataaataa tgacaacata gagatagttc tcattatagt agttgttata 1560 cagaattatt cactcaatgt taattttctg cattgaaatc ccagaacatt agaattgggg 
1620 gcattatttg aatctttaag gttataagga atacatttct cagcaataaa tggaaggagt 1680 tttgggttaa cttataaagt atacccaagt catttttttt tcagagaaga tatggtagaa 1740 agtcttagga ggttgaagaa ggaattggat atttattctt tctgagacta tcatgggaga 1800 taatgactat ggttgtccat gattggagcc gttgctgtag agttggtttt attatagtgt 1860 aggatttgaa tgggccatgt gttctcagac ctcagattaa aatgagaaaa ctgaggccag 1920 tggggagcgt gacttcacat gggtacactt gtgctagaga cagaaccagg attcaggact 1980 tctggctcct ggtcctgggt tcatggccca atgtagtctt tctcagtctt caggaggagg 2040 aagggcagga cccagtgttc tgagtcaccc tgaatgtgag cactatttac ttcgtgaact 2100 tcttggctta gtgcctctgc caggtggcca taacctctgg ccttgtgttg ccagagaaaa 2160 ggtttagttt tcaggctcca ttgcttccca gctgccaaga atgccttggt gcagcacagt 2220 cataggccct gcattcctca ttgccgtgct ggttggtcgg ggaggtgggc tggactcgta 2280 gggatttgcc ccttggcctt gtttctaaca cttgccgttt cctgctgtcc ccctgccccc 2340 tccactgcct gggtaaagat tgtcttgcca gctgtgtttg tctgcattgc ccttgtgttc 2400 agcctgatcg tgccaccctt tggcaagtac cccagcctgg aacttcagcc ctggatgtac 2460 aacgaacagt acacatttgt caggtatgtt tgtcttctac atcccaggag ggggtaagat 2520 tcgagcagac caaagatgtt tacgagggcc aagggaatgg acttcagaat tacacggtgg 2580 aatgaatttt actgctgcgg ctcaggtccc tgtataagct aatactgcat gcatagaaca 2640 gcagcgaact aaccctgaat aataggccag tcttctgttg agcctttcag cctctctcct 2700 cttcatccta ctgttgtcag gaacagccac atgtgtttta ggtgaaataa tccacccttg 2760 caaaaatcca tgattaagtt ataaaatatt tggatttgtg gagctgtgtt ttaattctgt 2820 aactgagtca cagggcacac tgtcaaagca tagaacctcc agagacttgt tttctgcaaa 2880 gtataattca tgtaattatt atctattctg ttatatttgg gatgttaggt agtgtttgtt 2940 ctttagataa aaatatcccc cactctgtaa caatacatta aatcaaagaa aaggacaaag 3000 gatttttctg ggtcttgtta gcaggagctt tcttcagtcc tgaaagattt gtagacctgt 3060 agatggggga actgtgtcag tgatacaaaa gggaagcatt taaaaaaaaa aagtatatat 3120 atatatatat atatatgtaa tgtgaattgg cctctttttc tctaagccca cattttcttc 3180 ttacatagtt caggtttact ttattttttc ctttccggct gctgaccctg tattgcccgt 3240 agttgtggaa catagcatgt gtttgtgacc tgtgcctgtt atttttgtgc tttctagttg 3300 
tgcatgcaaa gagtacaaag ttttcttgcc ctttcttgga aaatcctgct tgtctgtgcc 3360 aaagggataa ttgtgaaagc acttttgaaa tacttaatga gttgattttc ttcaaattaa 3420 aaaaaatata taaatgtatc tgtgtatgta catgtgtgta cacatacaca cctttataca 3480 tacagcccat ttaaaacaag ctccactttg gagtgctcta cgtcaccctg atgccgaata 3540 cagggccaga gtctgagatc cttctgggtg gtttctgtgt tttgttcatt tctgttttaa 3600 gagcctgtca cagagaaatg cttcctaaaa tgtttaattt ataaaaacat ttttatctct 3660 cgattactgg ttttaatgaa ttactaagct ggctgcctct catgtaccca cagcaatgat 3720 gctcctgagg acacgggaac cctggaactc ttaaacgccc tcaccaaaga ccctggcttc 3780 gggacccgct gtatggaagg aaacccaatc ccgtgagtgc cactttagcc ataagcaggc 3840 ttcttgtgct tgttgcctgg tttgatttct aatatgctgc atttatcaac tgcatgccac 3900 attgtgaccg ccagcatttg ccctttgaat tattattatg ttttatttac aaaaagcgaa 3960 ggtagtaacc gaactaaatt atctaggaac aaacgtttgg agagtcttct aacaccgtgc 4020 aaagcacgtc attacagaca tttgtttact gatttagaac cttaatattt aatttaaata 4080 gcactttaca cttactgatg aaatgctttt cctttctttc tctcccagcc cctgtactta 4140 agtgcttcaa taggctctca ttatatatga tttttaggtt ttgcttatca gcttcttcgc 4200 ttttataatc tgaaaagatg gcatatgaat ttttataaaa agggacactt tcttcttctc 4260 aaattgtata tttttattgt actttccttc aaaaccccct tttaaaaagt aagcagtgga 4320 taaataaatt cagtgaagca tccatatgac ccttaagtga gtgtagggga agggaggtca 4380 ccagatcact gtgagtgaag atggtggaga ggtgaggatc ttatgaggcc gtgctcaagg 4440 ctggtagagg tgggttagtg tttccaggtt taggcagaat ctcagctgag gtcatgaaac 4500 aacagtgatc tctgaaaaat tatggcaagg tgggaaggtg ctggagaatt ggagaggggg 4560 caaacttgac tttcaagttt caatgggaag ataggtgact ctgcacacca cagaacagtg 4620 agcatgataa cctgtttata caaggttcta gagcagattt ctaaatggat agctactgtg 4680 tgcttgtttg ttcttaatta gtattggata gttactaaat acttgttagt acttagtaca 4740 taatgggtgg taaatcctag cagctaatat tggttcccaa ataaccagat gacaaggata 4800 gagaaggaca cagacacggc ctatctggat ttcatggtgc ctttcatttt ccacatgaag 4860 gttgtgtagg gaagatagaa gcatgagatg agatgataat atagttatct ggattcatca 4920 ctggccagct gaaccatatg aactcatgga ttgatgctag cttaggaagg ctctgtagga 4980 gccagaactg 
ggctgagagc cagcccatag agacaaaaga ggcccggccc tgacatcaga 5040 gggttcaaac atgatgtctg agccccacct acagtctgcc ggaggtggtt ggaaggaaga 5100 gcctttatcc ttacaattct tactgaaatt caaattttta ggttttgcaa aaaaatggtg 5160 gacctgaagg aaatttgaca ggagcatgtc tcagctgtat ttaaatttgt ctcagccaat 5220 ccccttttga atgttcagag tgtaagcttc aggagggcag cgcgtcttag tgtgactttt 5280 ctggtcagtt caggtgcttt aaggagacaa ttagagatca atctggaaaa cttcatttga 5340 atttttaata cataagaaaa caataagaaa tagttaaaaa tatatattta taatatatat 5400 atgtgtgtgt gtgtgtgtgt gtgtgtgtgt gtatatatat atattttatt tatttatttt 5460 tttttgagat ggagtctcgc tctgttgccc aggctggagt gcagtggctc aatcttggct 5520 cactgccacc tctgcctccc aggttcaagt gattctccta cctcagcctc ctgagtagct 5580 gggattacaa gcatgtgcca ccacactggc taatttttct aattttagta gagatggagt 5640 ttcaccatgt tggacaggat ggtcttgaac tcctgactta gtgatccacc cgccttcgcc 5700 tcccaaagtt ctgggattac aggcatgagc catcgtgcct ggcaattata tttaatattt 5760 aataataagg aaataattgc tgtaacttta ctttaaattg tggaattctg aaactggaag 5820 ggaactggaa atgacttgtt gaatcaaatc attttaaact tttattttgc cagtggaaaa 5880 aataagcccc caaaagagca ggggacctgc tgatgtccca cagtaattca gagctggaga 5940 tgaggttgaa ggctttgtgt cttatctcca gggaaaattt gtagacagcg tagctcttta 6000 tgtgacgagc attctcaccc cagtcatccc ccaattctct actcatttga gaacataaat 6060 tggatcttgc cagtctctac tcatttttca gcacatcgag cataagatcc agactctttc 6120 ccaggcctct ctcatctggc tcctctcctc ctcctttatc attactcttc ttcgtagctt 6180 atcctactcc agccatgctg tcttcctatt attcctaaaa agtagaaatg catttcttcc 6240 tagggccttt gtacctgcac ttgccatcgc ttttgctcag aatgttcttt ttgccaagct 6300 tttgcccagc ttgttctcca tcattgttat gttttggctg aaatgtcttc tcttagtagg 6360 ttcattctcc ccagtcactg tctttttatt ttgctttatt ttgggccatc taaggttatc 6420 ttattagtgt atttgttgtt cgtctcctcc atgggcatac acctccatga aggcaggtat 6480 tttcacctta ggccctcgaa tatactggac agcatctggc acgtagtaga tgctcaacga 6540 atgtttgttg tgtgagcaaa tggttggttg attggattga actgagttca gtatgtaaat 6600 atttagggcc tctttgcatt ctattttact tatgtataaa atgatacata atgatgatat 6660 aaatgatgtc acagtgtaca 
aggctgttgt gggatcaagc aatcaaatga gatcatgctt 6720 gtcttttcca aatggtgagg gaatagatgc atgtttgtgg ttgttacgga atgatcctgt 6780 gctcctgagg caacagaaag gccaggccat ctctggtaat cctactcttg ctgtcttccc 6840 tttgcagaga cacgccctgc caggcagggg aggaagagtg gacactgccc cagttcccca 6900 gaccatcatg gacctcttcc agaatgggaa ctggacaatg cagaaccctt cacctgcatg 6960 ccagtgtagc agcgacaaaa tcaagaagat gctgcctgtg tgtcccccag gggcaggggg 7020 gctgcctcct ccacaagtga gtcactttca gggggtgatt gggcagaagg ggtgcaggat 7080 gggctggtag cttccgcttg gaagcaggaa tgagtgagat atcatgttgg gagggtctgt 7140 ttcagtcttt tttgtttttt gttttttttt ctgaggcgga gtcttgctct gtcgcccagg 7200 ctggagtgct gtggcatgat cttgcctcac tgcaacctcc acctcccagg ttcaagcgat 7260 tctcctgcct cagcctcctg agtagctggg attacaggca cgcaccacca tgtctggcta 7320 atttttgtgt ttttagtaga gatagggttt cgccgtgttg gctaggctgg tctggaat 7378 7 5689 DNA Homo sapiens 7 gaattcctga cctcaggtga tccacccgcc tcggcctccc aaagtgctgg gattacaggc 60 gtgagccact acgcccagcc ctgtttcagt ctttaactcg cttcttgtca taagaaaaag 120 catgtgagtt ttgaggggag aaggtttgga ccacactgtg cccatgcctg tcccacagca 180 gtaaagtcac aggacagact gtggcaggcc tggcttccaa tcttggctct gcaacaaatg 240 agctggtagc ctttgacagg cctgggcctg tttcttcacc tctgaattag ggaggctgga 300 ccagaaaact cctgtggatc ttgtcaactc tggtattctt agagactctg tttgggaagg 360 agtcctgagc catttttttt ttcttgagaa tttcaggaag aggagtgctt atgatagctc 420 tctgctgctt ttatcagcaa ccaaattgca ggatgaggac aagcaattct aaatgagtac 480 aggaactaaa agaaggcttg gttaccactc ttgaaaataa tagctagtcc aggtgcgggg 540 tggctcacac ctgtaatctc agtattttgg gatgccgagg tggactgatc acctaaggtc 600 aggagttcga aaccagcttg gccaatgtgg cgaaaccctg tctctactaa aaattcaaaa 660 attagccagg catggtggca catgcctgta atcccagtta cttgggaggc tgaagcagga 720 gaattgcttg aacctgggag gtggaggtcg cagggagcca aaattgcgcc actgtactcc 780 agcctgagca acacagcaaa actccatatc aaaaaataaa atgaataaaa taacagctaa 840 tctagtcatc agtataactc cagtgaacag aagatttatt aggcatagtg aatgatggtg 900 cttcctaaaa atctcttgac tacaaagaat ctcatttcaa tgtttattgt ttagatgttc 960 agaataaatt cttgggaaag 
accttggctt ggtgtaagtg aattaccagt gccgagggca 1020 gggtgaacca agtctcagtg ctggttgact gagggcagtg tctgggacct gtagtcaggt 1080 ttccggtcac actgtggaca tggtcactgt tgtccttgat ttgttttctg tttcaattct 1140 tgtctataaa gacccgtatg cttggttttc atgtgatgac agagaaaaca aaacactgca 1200 gatatccttc aggacctgac aggaagaaac atttcggatt atctggtgaa gacgtatgtg 1260 cagatcatag ccaaaaggtg actttttact aaacttggcc cctgccgtat tattactaat 1320 tagaggaatt aaagacctac aaataacaga ctgaaacagt gggggaaatg ccagattatg 1380 gcctgattct gtctattgga agtttaggat attatcccaa actagaaaag atgacgagag 1440 ggactgtgaa cattcagttg tcagcttcaa ggctgaggca gcctggtcta gaatgaaaat 1500 agaaatggat tcaacgtcaa attttgccac ttagtagcaa cttgaccagg taactggtta 1560 tccttttaaa gccttagttt atctaaattg tgatattaat gttgctctta taagtttgtc 1620 atgaggacta aattaaatgg tgtacataga gtgccttggg tactctctga tgggggactc 1680 catgataatt tgtggtctca tggagggagc tctgggaagg tttaggagcc tgccttggct 1740 ctgcagcctt gggagagcct tctagcttcc caggacatgg cagcctagtg ttgaatgctt 1800 ggctcagcaa atgtttgttc tcgtttcctt cccatcaact tggtcagttg gggtctttca 1860 gttaggagta tctcagtgac tttaaatggc atgggcatgc tggagtgata gtgaccatga 1920 gtttctaaga aagaagcata atttctccat atgtcatcca caattgaaat attattgtta 1980 attgaaaaag cttctaggcc aggcacggtg gctcatgcct gtaatcccag cactttagga 2040 ggccaaggcg ggtggatcac ttgaggtcag gagtttgaga ccagcctggc caacatgggg 2100 aaaccctgtc tctactaaaa atacaaaata agctgggcgt ggtggtgcgt gcctgtaatc 2160 ccagctactt gggaggctga ggcaggagaa ttgcttgaat ctgggaggcg gaggttgcag 2220 tgagctgagt tcatgccatt gcattccagc ctgggcaaca agagcgaaac catctcccaa 2280 aagaaaaaaa aaagaaagaa aaagcttcta gtttggttac atcttggtct ataaggtggt 2340 ttgtaaattg gtttaaccca aggcctggtt ctcatataag taatagggta tttatgatgg 2400 agagaaggct ggaagaggcc tgaacacagg cttcttttct ctagcacaac cctacaaggc 2460 cagctgattc tagggttatt tctgtccgtt ccttatatcc tcaggtggat atttactcct 2520 tttgcatcat taggaatagg ctcagtgctt tctttgaact gattttttgt ttctttgtct 2580 ctgcagctta aagaacaaga tctgggtgaa tgagtttagg taagttgctg tctttctggc 2640 acgtttagct cagggggagg atggtgttgt 
aggtgtcttg gattgaagaa agccttgggg 2700 attgtttgtc actcacacac ttgtgggtgc catctcactg tgaggaggac agaagccctg 2760 tgaacatgtg gagcacacag gggcacagac agatttagat taggcctgct ttatagagtt 2820 tctgcctaga gcatcatggc tcagtgccca gcagcccctc cagaggcctc tgaaatattt 2880 gatatactga tttccttgag gagaatcaga aatctcctgc aggtgtctag ggatttcaag 2940 taagtagtgt tgtgagggga atacctactt gtactttccc cccaaaccag attcccgagg 3000 cttcttaagg actcaaggac aatttctagg catttagcac gggactaaaa aggtcttaga 3060 ggaaataaga agcgccaaaa ccatctcttt gcactgtatt tcaacccatt tgtccttctg 3120 ggttttgaag gaacaggtgg gactggggac agaagagttc ttgaagccag tttgtccatc 3180 atggaaaatg agataggtga tgtggctacg tcagggggcc cgaaggctcc ttgttactga 3240 tttccgtctt ttctctctgc cttttcccca agggccagga cccctggatc tctgggcaga 3300 gcagacgcag gcccctataa tagccctcat gctagaaagg agccggagcc tgtgtataag 3360 gccagcgcag cctactctgg acagtgcagg gttcccactc tcccaactcc ccatctgctt 3420 gcctccagac ccacattcac acacgagcca ctgggttgga ggagcatctg tgagatgaaa 3480 caccattctt tcctcaatgt ctcagctatc taactgtgtg tgtaatcagg ccaggtcctc 3540 cctgctgggc agaaaccatg ggagttaaga gattgccaac atttattaga ggaagctgac 3600 gtgtaacttc tctgaggcaa aatttagccc tcctttgaac aggaatttga ctcagtgaac 3660 cttgtacaca ctcgcactga gtctgctgct gatgatactg tgcaccccac tgtctgggtt 3720 ttaatgtcag gctgttcttt taggtatggc ggcttttccc tgggtgtcag taatactcaa 3780 gcacttcctc cgagtcaaga agttaatgat gccatcaaac aaatgaagaa acacctaaag 3840 ctggccaagg taaaatatct atcgtaagat gtatcagaaa aatgggcatg tagctgctgg 3900 gatataggag tagttggcag gttaaacgga tcacctggca gctcattgtt ctgaatatgt 3960 tggcatacag agccgtcttt ggcatttagc gatttgagcc agacaaaact gaattactta 4020 gttgtacgtt taaaagtgta ggtcaaaaac aaatccagag gccaggagct gtggctcatg 4080 cctgtaatcc tagcactttg ggaggccgaa gcgggtggat cacttgaggt caggagttcg 4140 agaccagcct ggcctacatg acaaaacccc gtatctacta aaaatacaaa aaaattagct 4200 gggcttggtg gcacacacct gtaatcccag ctacttggga ggctgaggca ggagaattgc 4260 ttgaaccctg taggaagagg ttgtagtgag ccaagatcgc accgttgcac tccagcctgg 4320 gcaacaagag caaaactcca tctcaaaaaa caaattaaat 
ccagagattt aaaagctctc 4380 agaggctggg cgcggtggct tacacctgtt atcccagcat tttgggatgc cgaggcgggc 4440 aaagcacaag gtcaggagtt tgagaccagc ctggccaaca tagtgaaacc ctgtctctgc 4500 taaaaacata gaaaaattag ccgggcatgg tggcgtgcgc ctgtaatccc agctactcgg 4560 gaggctgagg tgagagaatt acttgaaccc gggaggcgga ggttgcagtg agcccagatt 4620 gcaccactgc actccagcct gggcgacaga gcaagactcc atctcaaaaa aagctctcag 4680 aacaaccagg tttacaaatt tggtcagttg gtaaataaac tgggtttcaa acatactttg 4740 ctgaaacaat cactgactaa ataggaaatg aatctttttt tttttttttt aagctggcaa 4800 gctggtctgt aggacctgat aagtactcac ttcatttctc tgtgtctcag gtttcccatt 4860 tttaggtgag aattaagggg ctctgataaa acagacccta ggattgtgga cagcagtgat 4920 agtcctagag tccacaagtc tgcttttgag tgatgggccc atgtatctgg cacatctgca 4980 ggcagagcgt ggttctggct cttcagatga tgccggtgga gcactttgag gagtcctcac 5040 cccaccgtga taaccagaca ttaaaatctt ggggctttgc atcccaggat ttctctgtga 5100 ttccttctag acttgtggca tcatggcagc atcactgctg tagatttcta gtcacttggt 5160 tctcaggagc cgtttattta atggcttcac atttaatttc agtgaacaag gtagtggcat 5220 tgctcttcac agggccgtcc tgttgtccac aggttccaga ttgactgttg ccccttatct 5280 atgtgaacag tcacaactga ggcaggtttc tgttgtttac aggacagttc tgcagatcga 5340 tttctcaaca gcttgggaag atttatgaca ggactggaca ccaaaaataa tgtcaaggta 5400 aaccgctgtc tttgttctag tagctttttg atgaacaata atccttatgt ttcctggagt 5460 actttcaact catggtaaag ttggcagggg cattcacaac agaaaagagc aaactattaa 5520 ctttaccagt gaggcagtac ggtgtagtgt agtgattcag agaatttgct ttgccaccag 5580 acataccagg taaccttgac taagttactt aacctatcta aacctcagtt ccctcatctg 5640 tgaaatggag acagtaatca tagctatttc caaactgttg tgagaattc 5689 8 645 DNA Homo sapiens 8 gaattcaatg agttaaaggt ataaggtcct caccacagcg cctgcccaca tagtcagtga 60 tcactatgtc ctgaacactg taattacttc gccatattct ctgatcatag tgttttgcct 120 tggtatgtga ctagaatttc tttctgaggt ttatgggcat ggttggtggg tatgcacctg 180 cctgcaggag cccggtttgg gggcattacc ttgtacctgg tatgttttct ttcaggtgtg 240 gttcaataac aagggctggc atgcaatcag ctctttcctg aatgtcatca acaatgccat 300 tctccgggcc aacctgcaaa agggagagaa ccctagccat 
tatggaatta ctgctttcaa 360 tcatcccctg aatctcacca agcagcagct ctcagaggtg gctctgtaag tgtggctgtg 420 tctgtataga tggagtgggg caagggagag ggttatggag aaggggagaa aaatgtgaat 480 ctcattgtag gggaacagct gcagagaccg ttatattatg ataaatctgg attgatccag 540 gctctgggca gaagtgataa gtttacgaat tggctggttg ggcttcttga actgcagaag 600 agaaaatgac actgatatgt aaaaatcgta acatttagtg aattc 645 9 1664 DNA Homo sapiens 9 gaattcatat aaagtgagtt caaaaattgt taattaaatt ataatttaat tataagtgtt 60 taatcagttt gatttgttta aaaaccactg ttttaaattt ggtggaatat gtttttatta 120 gcttgtatct ttaattccta aattaagctg tgtgtgtgtg tgtgtgtgtg tgtgtgtgtg 180 tgtgtgtgtg aagtttaaag ccaggatgag ctagtttaaa gtatgcagcc tttggagtca 240 tacagatctg ggtttgaatc tggtctctaa actttataga tgtatgatat taaatgaggc 300 agttcatgta aattgccaag cccagcactc agcacagagt tgatatttca cacacattag 360 atacctttcc tgtatgtgga gcatggcagt tcctgtttct gctttactcc tacaggatac 420 taatatagga cactaggatc tttataccaa gaccccatgt aatgggctta tgagaccatt 480 cttcttataa aaatctgaca gaatttttgt atgtgttaga tcaataggct gcatactgtt 540 attttcaagt tgatttacag ccagaaatat taatttattt gagtagttac agagtaatat 600 ttctgctctc atttagtttt caagccccac tagtcctttg tgtgtgaaaa tttacaactt 660 actgctctta caaggtcatg aacagtggac caaagtgaat gccattaacc actctgactt 720 ccttcattag ttttattgtg acagtggact cttttgacct cagtaatacc agtttggcat 780 ttacattgtc atatttttag acttaaaaat gatcatctta accctgaata aaatgtgtct 840 ggtgaacaga tgttttttct tgggctgtgc ctcagatatc tcctgtgtgt gtgtacgtgt 900 gtgtttgtct gtgtgtccat gtcctcactg attgagccct agctgcatca aaagacccct 960 cagattttca cacgcttttt ctctccagga tgaccacatc agtggatgtc cttgtgtcca 1020 tctgtgtcat ctttgcaatg tccttcgtcc cagccagctt tgtcgtattc ctgatccagg 1080 agcgggtcag caaagcaaaa cacctgcagt tcatcagtgg agtgaagcct gtcatctact 1140 ggctctctaa ttttgtctgg gatatggtaa ggacacaggc ctgctgtatc tttctgatgt 1200 ctgtcagggc catggattga tatggataag aaagaaagag ctctggctat catcaggaaa 1260 tgttccagct actctaaaga tgtatgaaaa agaaatagcc agaggcaggt gatcactttc 1320 atgacaccaa acacagcatt gggtaccaga gttcatgtca caccagaggg aaaattctgt 
1380 acacaatgat gaaaattaat accactacca cttaagttcc tatgtgacaa ctttcccaag 1440 aatcagagag atacaagtca aaactccaag tcaatgcctc taacttctct gatgggtttt 1500 aacctccaga gtcagaatgt tctttgcctt actaggaaag ccatctgtca tttgaaaact 1560 ctgtacattt tatcagcagc ttatccatcc attgcaaata tgtttttgtg ccagccacaa 1620 tatattgctt ctatttggac caataggggg atttgaagga attc 1664 10 1279 DNA Homo sapiens 10 gaattctcat aattgtccta tcgtcaagtc tttatttctg cattttactg cttgatacac 60 tgtcaggaca gactttaaaa ttattctcag tgcgatgaaa caattctgac attcatgtta 120 tgagcagtta cctcataaat agattacatg tgagattgaa cttgggcaga ctataatata 180 gcattaatga cgaaacagac acagtcatct tcgggaagaa gaatagaggc ttatttgctg 240 cctgtgaaat taaaattact ctgactggga atccatcgtt cagtaagttt actgagtgtg 300 acaccttggc ttgactgttg gaaagacaga aagggcatgt agtttataaa atcagccaag 360 gggaaaatgc ttgtcaaaat gtattgtcgg gtattttgat taatagttta tgtggcttca 420 ttaattcaga gttactctcc aatatgttta tctgcccttt cttgtctgat aatggtgaaa 480 acttgtgtga tgcattgtat atttgattta ggggtgaact ggatgtcttt gttttcactt 540 ttagtgcaat tacgttgtcc ctgccacact ggtcattatc atcttcatct gcttccagca 600 gaagtcctat gtgtcctcca ccaatctgcc tgtgctagcc cttctacttt tgctgtatgg 660 gtaagtcacc tctgagtgag ggagctgcac agtggataag gcatttggtg cccagtgtca 720 gaaggagggc agggactctc agtagacact tatctttttg tgtctcaaca ggtggtcaat 780 cacacctctc atgtacccag cctcctttgt gttcaagatc cccagcacag cctatgtggt 840 gctcaccagc gtgaacctct tcattggcat taatggcagc gtggccacct ttgtgctgga 900 gctgttcacc gacaatgtga gtcatgcaga gagaacactc ctgctgggat gagcatctct 960 gggagccaga ggacagtgtt taattgtgat cttattccac ttgtcagtgg tattgacact 1020 gctgactgcc ttgtcctgtc ttcagagtct gtcttccctg agaaggcaaa gcacctttct 1080 ttcttgctgt gccttacatt ttgctggtca agcctttcag tttcttttga cagttttttt 1140 tacttctttc ttttttcaat gttgctctta ccaagagtag ctcctctgcc ttccacttta 1200 cacatgagag ctgggcgacg ccattcagtc ctaaggcttt taccatcacc tctcttggtg 1260 tttttattgt catctctaa 1279 11 1124 DNA Homo sapiens 11 tttaattgat tcactaggat atatgctact gaaaggggaa tctgcttaaa gtgctttctg 60 atatttatta ttactaaaac 
ttagaattta ttaaaaatac tgactgtgaa aaattacttg 120 ggtcgtttgc ctttttaaaa ggatttttgg catgtctcat taaaaaaaga aatactagat 180 atcttcagtg aagttacaaa tcgaatacac attggctctg aaattctgat tgatactggg 240 tcataaaaag ttttcccaaa tcagacttgg aaagtgatca ctctcttgtt actctttttt 300 ccttgtcatg ggtgatagcc atttgtgttt attggaagat cggtgaattt taaggaacat 360 aggcccaaat ttgaggaagg gccatggttt ttgatccctc cattctgacc ggatctctgc 420 attgtgtcta ctaggggaga atcgctttgt gtcaccatta tcttgggact tggtgggacg 480 aaacctcttc gccatggccg tggaaggggt ggtgttcttc ctcattactg ttctgatcca 540 gtacagattc ttcatcaggc ccaggtgagc tttttcttag aacccgtgga gcacctggtt 600 gagggtcaca gaggaggcgc acagggaaac actcaccaat gggggttgca ttgaactgaa 660 ctcaaaatat gtgataaaac tgattttcct gatgtgggca tcccgcagcc ccctccctgc 720 ccatcctgga gactgtggca agtaggtttt ataatactac gttagagact gaatctttgt 780 cctgaaaaat agtttgaaag gttcattttt cttgtttttt cccccaagac ctgtaaatgc 840 aaagctatct cctctgaatg atgaagatga agatgtgagg cgggaaagac agagaattct 900 tgatggtgga ggccagaatg acatcttaga aatcaaggag ttgacgaagg tgagagagta 960 caggttacaa tagctcatct tcagtttttt tcagctttat gtgctgtaac ccagcagttt 1020 gctgacttgc ttaataaaag ggcatgtgtt cccaaaatgt acatctatac caaggttctg 1080 tcaattttat tttaaaaaca ccatggagac ttcttaaaga attc 1124 12 729 DNA Homo sapiens 12 gaattcccat tctcgaatac attggtttta tatgcttaca tttatgtgtt agttattaaa 60 acatactaat attgtatatc tagtcaaaac tgaggtagag agaataaatg gttgattttg 120 agtttgagtt tcatagtcca aaaagctgat atattgcctg tgttcaagag ggtctatatc 180 agccctctag atgccagcat ctccaaattt tacttttttg gaatctgtac agtatttgca 240 atatttttat tacaaatttc tactctgtgg aatttaattt ttaaaatacc tgcaatacat 300 atatatgttg aatagatgaa aaattatgta gataataatg aatgatacgg ttctaaaaag 360 acaggttaaa aagtaagttc acttttattt tgagcttcag aatcattcag aagccagtcg 420 ccacaaacgc agaccaaggc tcttggcaca tcaaatatgc ctatggctta gggttattga 480 caagtcttat gttgcagtgt atgtggttta tagtcctgcc ttccacagtt gcttgggaga 540 gctgtgagtc actgaggctt atgaatgttt acattttgtt tgttgcagat atatagaagg 600 aagcggaagc ctgctgttga caggatttgc gtgggcattc 
ctcctggtga ggtaaagaca 660 ctttgtctat attgcgtttg tccctattag ttcagactat ctctacccaa tcaagcaacg 720 atgctcgtt 729 13 731 DNA Homo sapiens 13 acatgtgcca gtactggtga gagcgcaagc tttggagtca aacacaaatg ggtttgcatc 60 ctggccctac caattatgag ctctgagcca tgggcaagtg actaactccc tgggcctcag 120 tttctctgta acatctgtca gacttcatgg gtccaggtga ggattaaagg agatcatgta 180 tttacagcac atggcatggt gcttcacata aaataagtat ttagtaaatg ataactggtt 240 ccttctctca gaaacttatt tctgggcctg ccaggggccg ccctttttca tggcacaagt 300 tgggttccca gggttcagta ttcttttaaa tagttttctg gagatcctcc atttgggtat 360 tttttcctgc tttcaggttt ggagatggtt atacaatagt tgtacgaata gcagggtcca 420 acccggacct gaagcctgtc caggatttct ttggacttgc atttcctgga agtgttctaa 480 aagagaaaca ccggaacatg ctacaatacc agcttccatc ttcattatct tctctggcca 540 ggatattcag catcctctcc cagagcaaaa agcgactcca catagaagac tactctgttt 600 ctcagacaac acttgaccaa gtaagctttg agtgtcaaaa cagatttact tctcagggtg 660 tggattcctg ccccgacact cccgcccata ggtccaagag cagtttgtat cttgaattgg 720 tgcttgaatt c 731 14 3501 DNA Homo sapiens 14 gaattcttca acagggaaaa cagctagctt gaaaacttgc tgaaaaacac aacttgtgtt 60 tatggcattt agtaccttca aataattggc tttgcagata ttggataccc cattaaatct 120 gacagtctca aatttttcat ctcttcaatc actagtcaag aaaaatataa aaacaacaaa 180 tacttccata tggagcattt ttcagagttt tctaacccag tcttattttt ctagtcagta 240 aacatttgta aaaatactgt ttcactaata cttactgtta actgtcttga gagaaaagaa 300 aaatatgaga gaactattgt ttggggaagt tcaagtgatc tttcaatatc attactaact 360 tcttccactt tttccaaaat ttgaatatta acgctaaagg tgtaagactt cagatttcaa 420 attaatcttt ctatattttt taaatttaca gaatattata taacccactg ctgaaaaaga 480 aaaaaatgat tgttttagaa gttaaagtca atattgattt taaatataag taatgaaggc 540 atatttccaa taactagtga tatggcatcg ttgcatttta cagtatcttc aaaaatacag 600 aatttataga ataatttctc ctcatttaat atttttcaaa atcaaagtta tggtttcctc 660 attttactaa aatcgtattc taattcttca ttatagtaaa tctatgagca actccttact 720 tcggttcctc tgatttcaag gccatatttt aaaaaatcaa aaggcactgt gaactatttt 780 gaagaaaaca caacatttta atacagattg aaaggacctc ttctgaagct agaaacaatc 840 
tatagttata catcttcatt aatactgtgt taccttttaa aatagtaatt ttttacattt 900 tcctgtgtaa acctaattgt ggtagaaatt tttaccaact ctatactcaa tcaagcaaaa 960 tttctgtata ttccctgtgg aatgtaccta tgtgagtttc agaaattctc aaaatacgtg 1020 ttcaaaaatt tctgcttttg catctttggg acacctcaga aaacttatta acaactgtga 1080 atatgagaaa tacagaagaa aataataagc cctctataca taaatgccca gcacaattca 1140 ttgttaaaaa acaaccaaac ctcacactac tgtatttcat tatctgtact gaaagcaaat 1200 gctttgtgac tattaaatgt tgcacatcat tcattcactg tatagtaatc attgactaaa 1260 gccatttgtc tgtgttttct tcttgtggtt gtatatatca ggtaaaatat tttccaaaga 1320 gccatgtgtc atgtaatact gaaccacttt gatattgaga cattaatttg taccctgtgt 1380 tattatctac tagtaataat gtaatactgt agaaatattg ctctaattct tttcaaaatt 1440 gttgcatccc ccttagaatg tttctatttc cataaggatt taggtatgct attatccctt 1500 cttataccct aagatgaagc tgtttttgtg ctctttgttc atcattggcc ctcattccaa 1560 gcactttacg ctgtctgtaa tgggatctat ttttgcactg gaatatctga gaattgcaaa 1620 actagacaaa agtttcacaa cagattttct aagttaaatc attttcatta aaaggaaaaa 1680 agaaaaaaaa tttttgtatg tcaataacct ttatatgaag tattaaaatg catatttcta 1740 tgttgtaata taatgagtca caaaataaag ctgtgacagt tctgttggtc tacagaaatt 1800 tacttttgtg catttgtggc accacctact gttgaagggt tataaagcca ttagaaaagt 1860 agaggggaag tgatttggat caaaaggaaa aactttagaa aagattcaaa tgttccctta 1920 atcataaaag agaactgagg ggactacttg aaaataaaag gttgttttgt attttcatgt 1980 tggttaagat actgagtaac tggtattaag tgttagaggt ttttagataa atattctgct 2040 taatgattat gaagctgcac tgagatttct gaaaatgctc tgtagctgag cttatttaat 2100 aaatgttcac ttggtatagg ggaagctaca aaggcagcct tcagtgtcct tttgtttatt 2160 caaccaaaaa tataaggaca caatgtagca gttatactgg gaaggtgctg ggggtggtgg 2220 caatggtgag caggaaggcg aagtagatat ggaaacagaa atgatactaa tatcggtgat 2280 tccttccttt tttcctgtaa taagtgctgt gcagacaaca tatgagcagt gctgataaat 2340 gtaaatgtat ttttcatagc tcattaagaa tcagtttcag aaagagatgt ctgcttattt 2400 tgctacttga agaatccctg tcaaacagtc cttttgagga agtacaagag gctgtctcta 2460 tttgtgacct caggaatggc tgtgacagtg tcgtgagcag tccttttcct gtggcacaga 2520 tctgaacttt 
gtgtgcagaa aaatcttggc ttcaagtgag ccaagatgcc ccctgagcat 2580 cagcatcaca acttcatcct cctatcttga agttcatgtt atagtgactt taatgaaatc 2640 atagaacact gtttcttcgt gaacaatgac gagggagagg aaaaaacttt attgaaaaat 2700 aaaaaggcag gtaatttaga tgaaaatatg ttacccatga ggttttgttt ttgctttttg 2760 tttttgtttt tgagaaacag aatctcgctc tgtcgtccag gctggagtgc agcggcatga 2820 tcttggctca ctgcaacctc cgcctcccgg gttcaagcga ttctcctcag cttcccaagt 2880 agctggtact acaggcatgc gccaccacaa ccagctaatt tttgtatttt tagtagagat 2940 ggggtttcac tatacgttgg ccaggctggt ctcaaactcc tgacctaagg tgatccttct 3000 gccttgggct cccaaagtgc tgggattaca ggcatgagcc accttgcctg gccctaccca 3060 tgagccttga ctaaaacatt cttctatctg tagaaaagcc caaaagaact tttccagatt 3120 caaaaaactt ggcactttgt aatggtaatg tttacattaa gtaaaaaaaa aaaaaaaaaa 3180 cccacttagc ttcagttttc aagtgtttac tgtgttgtca tgcacttcat ttaattctca 3240 acacctgccc tatgaggtaa aaagtaccat tttacatatg agtaaattac agctcagtgg 3300 ataagaaact cgtccaaagg tacaggttca gtcaagtggc agagggttct ttttgttgaa 3360 gttaggtatc agttaaaatt gaccttgtaa aatcacatca gcatcaatat acattaattt 3420 aacaaatatt tattgaactt tactgtatgc cagatacttc tctaggtact agggggtaca 3480 atgtagaaga aaatagaatt c 3501 15 151 DNA Homo sapiens 15 atgctgttgg acagcaggga caatgaccac ttttgggaac agcagttgga tggcttagat 60 tggacagccc aagacatcgt ggcgtttttg gccaagcacc cagaggatgt ccagtccagt 120 aatggttctg tgtacacctg gagagaagct t 151 16 206 DNA Homo sapiens 16 tgtgtcaacc tgaacaagct agaacccata gcaacagaag tctggctcat caacaagtcc 60 atggagctgc tggatgagag gaagttctgg gctggtattg tgttcactgg aattactcca 120 ggcagcattg agctgcccca tcatgtcaag tacaagatcc gaatggacat tgacaatgtg 180 gagaggacaa ataaaatcaa ggatgg 206 17 177 DNA Homo sapiens 17 gtactgggac cctggtcctc gagctgaccc ctttgaggac atgcggtacg tctggggggg 60 cttcgcctac ttgcaggatg tggtggagca ggcaatcatc agggtgctga cgggcaccga 120 gaagaaaact ggtgtctata tgcaacagat gccctatccc tgttacgttg atgacat 177 18 223 DNA Homo sapiens 18 ctttctgcgg gtgatgagcc ggtcaatgcc cctcttcatg acgctggcct ggatttactc 60 agtggctgtg atcatcaagg gcatcgtgta 
tgagaaggag gcacggctga aagagaccat 120 gcggatcatg ggcctggaca acagcatcct ctggtttagc tggttcatta gtagcctcat 180 tcctcttctt gtgagcgctg gcctgctagt ggtcatcctg aag 223 19 222 DNA Homo sapiens 19 ttaggaaacc tgctgcccta cagtgatccc agcgtggtgt ttgtcttcct gtccgtgttt 60 gctgtggtga caatcctgca gtgcttcctg attagcacac tcttctccag agccaacctg 120 gcagcagcct gtgggggcat catctacttc acgctgtacc tgccctacgt cctgtgtgtg 180 gcatggcagg actacgtggg cttcacactc aagatcttcg ct 222 20 205 DNA Homo sapiens 20 agcctgctgt ctcctgtggc ttttgggttt ggctgtgagt actttgccct ttttgaggag 60 cagggcattg gagtgcagtg ggacaacctg tttgagagtc ctgtggagga agatggcttc 120 aatctcacca cttcggtctc catgatgctg tttgacacct tcctctatgg ggtgatgacc 180 tggtacattg aggctgtctt tccag 205 21 15 DNA Homo sapiens 21 gccagtacgg aattc 15 22 105 DNA Homo sapiens misc_feature (24)..(24) n is a, c, g, or t 22 gaattcccag gccctggtat tttncttgca ccaagtccta ctggtttggc gaggaaagtg 60 atgagaagag ccaccctggt tccaaccaga agagaatatc agaaa 105 23 132 DNA Homo sapiens 23 gtcaatcctg accgggttgt tccccccgac ctcgggcacc gcctacatcc tgggaaaaga 60 cattcgctct gagatgagca ccatccggca gaacctgggg gtctgtcccc agcataacgt 120 gctgtttgac at 132 24 143 DNA Homo sapiens 24 gctgactgtc gaagaacaca tctggttcta tgcccgcttg aaagggctct ctgagaagca 60 cgtgaaggcg gagatggagc agatggccct ggatgttggt ttgccatcaa gcaagctgaa 120 aagcaaaaca agccagctgt cag 143 25 138 DNA Homo sapiens 25 gtggaatgca gagaaagcta tctgtggcct tggcctttgt cgggggatct aaggttgtca 60 ttctggatga acccacagct ggtgtggacc cttactcccg caggggaata tgggagctgc 120 tgctgaaata ccgacaag 138 26 221 DNA Homo sapiens 26 gccgcaccat tattctctct acacaccaca tggatgaagc ggacgtcctg ggggacagga 60 ttgccatcat ctcccatggg aagctgtgct gtgtgggctc ctccctgttt ctgaagaacc 120 agctgggaac aggctactac ctgaccttgg tcaagaaaga tgtggaatcc tccctcagtt 180 cctgcagaaa cagtagtagc actgtgtcat acctgaaaaa g 221 27 73 DNA Homo sapiens 27 gaggacagtg tttctcagag cagttctgat gctggcctgg gcagcgacca tgagagtgac 60 acgctgacca tcg 73 28 203 DNA Homo sapiens 28 atgtctctgc tatctccaac ctcatcagga agcatgtgtc 
tgaagcccgg ctggtggaag 60 acatagggca tgagctgacc tatgtgctgc catatgaagc tgctaaggag ggagcctttg 120 tggaactctt tcatgagatt gatgaccggc tctcagacct gggcatttct agttatggca 180 tctcagagac gaccctggaa gaa 203 29 49 DNA Homo sapiens 29 atattcctca aggtggccga agagagtggg gtggatgctg agacctcag 49 30 114 DNA Homo sapiens 30 atggtacctt gccagcaaga cgaaacaggc gggccttcgg ggacaagcag agctgtcttc 60 gcccgttcac tgaagatgat gctgctgatc caaatgattc tgacatagac ccag 114 31 149 DNA Homo sapiens 31 aatccagaga gacagacttg ctcagtggga tggatggcaa agggtcctac caggtgaaag 60 gctggaaact tacacagcaa cagtttgtgg cccttttgtg gaagagactg ctaattgcca 120 gacggagtcg gaaaggattt tttgctcag 149 32 125 DNA Homo sapiens 32 attgtcttgc cagctgtgtt tgtctgcatt gcccttgtgt tcagcctgat cgtgccaccc 60 tttggcaagt accccagcct ggaacttcag ccctggatgt acaacgaaca gtacacattt 120 gtcag 125 33 99 DNA Homo sapiens 33 caatgatgct cctgaggaca cgggaaccct ggaactctta aacgccctca ccaaagaccc 60 tggcttcggg acccgctgta tggaaggaaa cccaatccc 99 34 189 DNA Homo sapiens 34 agacacgccc tgccaggcag gggaggaaga gtggacactg ccccagttcc ccagaccatc 60 atggacctct tccagaatgg gaactggaca atgcagaacc cttcacctgc atgccagtgt 120 agcagcgaca aaatcaagaa gatgctgcct gtgtgtcccc caggggcagg ggggctgcct 180 cctccacaa 189 35 95 DNA Homo sapiens 35 agaaaacaaa acactgcaga tatccttcag gacctgacag gaagaaacat ttcggattat 60 ctggtgaaga cgtatgtgca gatcatagcc aaaag 95 36 33 DNA Homo sapiens 36 cttaaagaac aagatctggg tgaatgagtt tag 33 37 107 DNA Homo sapiens 37 gtatggcggc ttttccctgg gtgtcagtaa tactcaagca cttcctccga gtcaagaagt 60 taatgatgcc atcaaacaaa tgaagaaaca cctaaagctg gccaagg 107 38 75 DNA Homo sapiens 38 gacagttctg cagatcgatt tctcaacagc ttgggaagat ttatgacagg actggacacc 60 aaaaataatg tcaag 75 39 170 DNA Homo sapiens 39 gtgtggttca ataacaaggg ctggcatgca atcagctctt tcctgaatgt catcaacaat 60 gccattctcc gggccaacct gcaaaaggga gagaacccta gccattatgg aattactgct 120 ttcaatcatc ccctgaatct caccaagcag cagctctcag aggtggctct 170 40 178 DNA Homo sapiens 40 gatgaccaca tcagtggatg tccttgtgtc catctgtgtc atctttgcaa tgtccttcgt 60 
cccagccagc tttgtcgtat tcctgatcca ggagcgggtc agcaaagcaa aacacctgca 120 gttcatcagt ggagtgaagc ctgtcatcta ctggctctct aattttgtct gggatatg 178 41 116 DNA Homo sapiens 41 tgcaattacg ttgtccctgc cacactggtc attatcatct tcatctgctt ccagcagaag 60 tcctatgtgt cctccaccaa tctgcctgtg ctagcccttc tacttttgct gtatgg 116 42 145 DNA Homo sapiens 42 gtggtcaatc acacctctca tgtacccagc ctcctttgtg ttcaagatcc ccagcacagc 60 ctatgtggtg ctcaccagcg tgaacctctt cattggcatt aatggcagcg tggccacctt 120 tgtgctggag ctgttcaccg acaat 145 43 130 DNA Homo sapiens 43 gggagaatcg ctttgtgtca ccattatctt gggacttggt gggacgaaac ctcttcgcca 60 tggccgtgga aggggtggtg ttcttcctca ttactgttct gatccagtac agattcttca 120 tcaggcccag 130 44 121 DNA Homo sapiens 44 acctgtaaat gcaaagctat ctcctctgaa tgatgaagat gaagatgtga ggcgggaaag 60 acagagaatt cttgatggtg gaggccagaa tgacatctta gaaatcaagg agttgacgaa 120 g 121 45 63 DNA Homo sapiens 45 atatatagaa ggaagcggaa gcctgctgtt gacaggattt gcgtgggcat tcctcctggt 60 gag 63 46 244 DNA Homo sapiens 46 gtttggagat ggttatacaa tagttgtacg aatagcaggg tccaacccgg acctgaagcc 60 tgtccaggat ttctttggac ttgcatttcc tggaagtgtt ctaaaagaga aacaccggaa 120 catgctacaa taccagcttc catcttcatt atcttctctg gccaggatat tcagcatcct 180 ctcccagagc aaaaagcgac tccacataga agactactct gtttctcaga caacacttga 240 ccaa 244 47 1237 DNA Homo sapiens 47 gaattcttca acagggaaaa cagctagctt gaaaacttgc tgaaaaacac aacttgtgtt 60 tatggcattt agtaccttca aataattggc tttgcagata ttggataccc cattaaatct 120 gacagtctca aatttttcat ctcttcaatc actagtcaag aaaaatataa aaacaacaaa 180 tacttccata tggagcattt ttcagagttt tctaacccag tcttattttt ctagtcagta 240 aacatttgta aaaatactgt ttcactaata cttactgtta actgtcttga gagaaaagaa 300 aaatatgaga gaactattgt ttggggaagt tcaagtgatc tttcaatatc attactaact 360 tcttccactt tttccaaaat ttgaatatta acgctaaagg tgtaagactt cagatttcaa 420 attaatcttt ctatattttt taaatttaca gaatattata taacccactg ctgaaaaaga 480 aaaaaatgat tgttttagaa gttaaagtca atattgattt taaatataag taatgaaggc 540 atatttccaa taactagtga tatggcatcg ttgcatttta cagtatcttc aaaaatacag 600 
aatttataga ataatttctc ctcatttaat atttttcaaa atcaaagtta tggtttcctc 660 attttactaa aatcgtattc taattcttca ttatagtaaa tctatgagca actccttact 720 tcggttcctc tgatttcaag gccatatttt aaaaaatcaa aaggcactgt gaactatttt 780 gaagaaaaca caacatttta atacagattg aaaggacctc ttctgaagct agaaacaatc 840 tatagttata catcttcatt aatactgtgt taccttttaa aatagtaatt ttttacattt 900 tcctgtgtaa acctaattgt ggtagaaatt tttaccaact ctatactcaa tcaagcaaaa 960 tttctgtata ttccctgtgg aatgtaccta tgtgagtttc agaaattctc aaaatacgtg 1020 ttcaaaaatt tctgcttttg catctttggg acacctcaga aaacttatta acaactgtga 1080 atatgagaaa tacagaagaa aataataagc cctctataca taaatgccca gcacaattca 1140 ttgttaaaaa acaaccaaac ctcacactac tgtatttcat tatctgtact gaaagcaaat 1200 gctttgtgac tattaaatgt tgcacatcat tcattca 1237 48 3002 DNA Homo sapiens 48 acatggcaat ggcattcatt aggaatctag ctgggaaaat ccagtgtgta tgcttggaaa 60 tgagggatct ggggctggag agaaaggcat gggcatgcct tggagggact tgtgtgtcaa 120 gctgaggacc tttactttaa gctctagggg accaggcaag gggagatgta gatacgttac 180 tctgatgggg tggatgaatt gaagaaggat gaggcaagaa tgaaggcaga gaccagggag 240 gaggctctcc aagtggccaa ggcataaagc aagaaatgag gcctggtgac tgcttagtgg 300 cagagcagtg aaagagaggg aggcatcaaa gtgagtctcg atttctagct gggtgggtgg 360 tagcgatgtc cagtaggcca gtggctactg aggtctgcag tggaggaggg tggttgggct 420 ggagacagat gatgagggag tcatcagcct gtgggtggaa gaaaagggaa cctcttccaa 480 ctgttttctt tgcttcttcc ctctctttct cttttttttt ttttttggac agagtcttgc 540 tctgtcaccc aggctgaaat gcagtggcat gatcttggct caccacagcc tccgcctcct 600 gggttcaagc aattctcctg tctcagcctc cagagtagct gggattacag gcacatatca 660 ctgtgcccgg ctaatttttg tattttcagt ggagatggga tttcaccatg ttggtcgggc 720 tggaatgaac tcctgacctc aagtgatcca cctgcctcag cctcccaaag tgttgggatt 780 acaggcattg agccaccgcg cccggccttt cttccctctc ttaaagagtg tttatttaat 840 tccacaaaca tgagcttgtc accccctgta gcctggcatc tcctacacga ggtgatggct 900 gaggcttctg cttctgctgg ggtagctctg atctttctgc tttctctggc actgtctacc 960 catgttgcct caccccacag gtcccagggc acctctctcg ggcaagtctt ggaaccctct 1020 gacactgatt tgctctcttt 
tctgagctgc ttttagccac ccatcctcgg gacctgtttt 1080 ctctctgcct ccacccctgc gggcagtctt aggtctcctg cccctcacga gcaccccaga 1140 gaggccacgt gctcagtgat ctcagtgggc gcatctttct agtcttgcta ttctttttgg 1200 ccatgttgtt cagaaaccat actgggcagg gccgacttca ccctaaaggc tgcgtctctt 1260 cactctgctt ttgtttgttc caaataaagt ggcttcagaa ttgctaaccc tagcctctgt 1320 gaacttgtga ggtacaattt tgtgtctgtt atgttaacaa aaatacatac ataccttcct 1380 ggtgatggta taaattgcta ttctctattg gaaagcaatt tggaatgaaa atttaaagaa 1440 ccattttaaa atatgctatc ctgcgtacct ccattccacc cacccccagg gatgtagcct 1500 actgaaataa ttttaaagaa gtcaccatat gagagaaaat gttattgcta tattgttatt 1560 gtgagaaatt ggaaatagac taaatgttca gcactatagg aataattaat gaaattacat 1620 atactctata caatcattat gctgccattg aaataataaa tacaaaggcg caagggggga 1680 aaagcttata atgttagtga aactaagact gattttttta taaagcagca gttttcagac 1740 ccttggagac tccaattcgg tagaaccaga gcttcatctt ctctgtcgaa gctgtgacag 1800 gagttgcaaa tgcctctcct ttttgctgag tttgcagctg ctgtttttcc ggcagcacat 1860 ctgtgcaggc ctctgcctcg gcccctctgg atctgctgat tgagcagcgg attgatctgt 1920 ccttctcttt cgtgttgacc catgtgagga accaactggc aagggaacaa gaaatggaaa 1980 taggcctcct ttgcatcatg acctgtacat cctgcaattg gaaaagattg tactttagtt 2040 ggtttaacca gcagcattat ttttctaaac taagcagtaa gaaggaatta ggttttatgt 2100 gggatcaaca gactgggtct caaaagagga aggtgataga acacagtggg gagggggagg 2160 tgcactagaa acagagggcc tatgctttca ttctggcttt gctacttaat agctgtgtga 2220 cccaatctta gagacttaac ctctctgaac ttccattttc tcatgtataa aatgggaaat 2280 attaaaggat actcactggg ctggtggctt gtgcctgtaa tcccagcact tggggaggtt 2340 gaggtgggag gatcacttga gcccaggtgt tcaagaccag cccaggcaac atggcaagac 2400 tctgtctcta tgaaaaaatt aaaaattagc caggtgtggt ggtgtgcacc tgtagtctta 2460 gctacttggt aggctgagat gggaggatca cttgggcttg ggaggtcaag gctgcggtga 2520 gctgtgattc catcactgca ctccagcccg ggcggcagag cgagacactg aatccaaacg 2580 acaacaacaa caaaaggcaa aaaaataaaa gtgccctctt tatggagttg tgtaaggtga 2640 agcatataca ctattcaaca tagtaactat ataaaggaag tattgttgtt gttactgtag 2700 ttaataccat taagtgagat gtttcgtata 
gtggaaagca catggactct gaattcagac 2760 tggtctgact ttgagtctca gctccacatc tagtaatact atgaccaagc cctggttaaa 2820 atcatgtttt tttttcttca gccttagtct tctcacatat aaaataggga cactgtcatt 2880 tacctcagtt ttctgtgagg ataaaacaac gacagtgtat atgcaagtat tttgtaaatt 2940 ttgtagtgct cctcaagatt tagttggtgt ttactacttg tactttctca ctggaatggc 3000 ag 3002 49 397 DNA Homo sapiens 49 aattcgggtc caattaaatt tttgaaattt tatattaaaa attatattag tagggatggg 60 taagaggtgt tttggtctgg ttggttggtt agttgctatg actcagaatt gctaagaaaa 120 cagaaaagta agataagatc attgttttaa cctcttttcc tccacaaaat caataaataa 180 catatcccta aattactctt agaatttctc ttaaattgca gtgaaaaacc aaaatccttc 240 attcttggtt gaaggttgga aaactacgtt agagaggatt agagagagag gatgagcaat 300 cgtgtagtca gcccttgcct cctagtgtag gatttgtctc agccactgct tgttgtcctg 360 gctgccaacg ttctcatgaa ggctgttctt ctatcag 397 50 520 DNA Homo sapiens 50 gtaagtggaa tcccatcaca ccagcctggt cttggggagg tccagagcac ctattatatt 60 aggacaagag gtactttatt ttaactaaaa atttggtaga aatttcaaca acaacaaaaa 120 aactcaactt ggtgtcatga ttttggtgaa attggtacat gacttgctgg aaggtttttc 180 ataggtcata aaataacagt atcttttgat ttagcatttc tactcaaggg aattaattcc 240 aggaattttg gtggcaggca cctgtaatcc cagctactcg ggaggctgag gcaggagaat 300 tgcttgaacc caggaggcag aggttgcagt gagctaagat cgcatcattg cactcccgcc 360 tgggcaataa gagtgaaact ccatctcaaa aaaaaaaaaa gatacaaaaa tagaaaaagg 420 ggcttggtaa gggtagtagg gttttgggca attttttttt tttttttttt ttattgtatg 480 gttctaaagg aatggttgat tacctgtggt ttggttttag 520 51 1786 DNA Homo sapiens 51 gtaagttacc tgcaagccac tgtttttaac cagtttatac tgtgccagat gggggtgtat 60 atatgtgtgt gcatgtgcat gcatgtgtga atgatctgga aataagatgc cagatgtaag 120 ttgtcaacag ttgcagccac atgacagaca tagatatatg tgcacacact agtaaacctc 180 tttccttctc atccatggtt gccactttta tctttttatt tttatttttt tttttgagat 240 ggagtctcgc tctgacgccc aggctggagt gcagtggctc gatctcggct cactgcaacc 300 tttgcctccc gggttcaagc tattctcctg cctcagcctc cacagtagct gggactacag 360 gctcatgctg ccacgcccgg ctgacttttt gtattttagt agagacgagg tttcaccatg 420 ttacccaggc tagacttcaa 
ctcctgagct caggcaatcc accctccttg gcctcccaaa 480 gtgctgggat tacaggtgtg agccactgca cccagcccac cactttaatt ttttacactc 540 tacccttttg gtcaaaattt gctcaatctg caagcttaaa atgtgtcatg acaaacacat 600 gcaagcacat actcacacat agatgcagaa acagcgtcta aacttataaa agcacagttt 660 atgtaaatgt gtgcacttct tctccctagg tggtaaacca catttcaaaa caacccaaat 720 aaaactgaac aaagcttctt cctcttagac tttttagaaa atctttcagt gctgagtcac 780 taagctgcca agttctcatt gtgggaacta tgcctttgga tgtaatgatt tcttctaaga 840 caatgggcgg aggtgtagtt attgcagaca tctgaaatat gtaatgtttc ttccagattc 900 tggaaattct cttattctct gtggttggtg gtggtggtgg gatgtgtgtg tgtgtgtgtg 960 tgtgtgtgtg tgtgtgtgtg tagggatcag gatgcgggag gagctgggtt ctgcttgtat 1020 tggttctctg ttttgcattg aatagtgtgt ttccttgtat ggctatctat agcttttcaa 1080 ggtcaccaga aattatcctg tttttcacct tctaaacaat tagctggaat ttttcaaagg 1140 aagactttta caaagacccc taagctaagg tttactctag aaaggatgtc ttaagacagg 1200 gcacaggagt tcagaggcat taagagctgg tgcctgttgt catgtagtga gtatgtgcct 1260 acatggtaaa gctttgacgt gaacctcaag ttcagggtcc aaaatctgtg tgccttttta 1320 ctttgcacat ctgcattttc tattctagct tggaatctga aacattgaca agagctgcct 1380 gaaatgtatg tctgtggtgt gattagagtt acgataagca agtcaatagt gagatgacct 1440 tggagatgtt gaacttttgt gagagaatga gttgtttttt tgttttggtt tttagtactt 1500 taacataatc tacctttagt ttaagtatcg ctcacagtta cctagttact gaagcaagcc 1560 cccaaagaaa tttggtttgg caacactttg ttagcctcgt ttttctctct acattgcatt 1620 gctcgtgaag cattggatca tacgtacatt tcagagtcta gagggcctgt ccttctgtgg 1680 cccagatgtg gtgctccctc tagcatgcag gctcagaggc cttggcccat caccctggct 1740 cacgtgtgtc tttctttctc cccttgtcct tccttggggc ctccag 1786 52 1745 DNA Homo sapiens misc_feature (545)..(545) n is a, c, g, or t 52 gtaaggcagc ctcactcgct cttccctgcc aggaaactcc gaaatagctc aacacgggct 60 aagggaggag aagaagaaaa aaaatccaag cctctggtag agaaggggtc atacctgtca 120 tttcctgcaa tttcatccat ttatagttgg ggaaagtgag gcccagagag gggcagtgac 180 ttgcccaagg tcaacccagc cgggtagcag ctaagtagga tgagagtgca gggttcatgc 240 tttccagata accacatgct caactgtgcc atgctgtctc attggtagtg 
gttcatggca 300 gcatctgaaa gctatttatt ttcttagata tattgggtgg cgattcttcc taagtttcta 360 agaacaataa tcagaaggat atatattgtt gcaggttaga ctgtctggaa gcagaggctg 420 aaatagagtt tgatgtatgg gtatttatga gggctcaata cctatgaaga gatatggaag 480 atgcaggatt gggcagaggg aggagttgaa ctgtgatata gggccaaccc cgtggggcac 540 tctanagaat atgcagcttg ttggagttgt tnttcatcga gctgaaacat ccagcccttt 600 gtgctccccc aaggcctccc tcctgacacc acctacctca gccctctcaa tcaatcactg 660 gatgtgggct gccctgggaa ggtcgtgccc cagggcctac atggctctct gctgctgtga 720 caaacccaga gttgctgatg cctgaggccg tctactgaca gctgggcaac aaggcttccc 780 tgaatgggga ctctgggcag tgcagttttg tgtctgaacc atacattaat atatttatat 840 ccgaattttc tttctctgca agcatttcat ataaagacac atcaggtaaa aataaatgtt 900 tttgaagcaa aaggagtaca aagagataag aactaactaa tttaatacta gttaccatct 960 gttacaaata gttcctactg attgccaagg actgtttaaa cacatcacat gggcttcttc 1020 ttctatcctc actaaccctt ttaacagaca aggaaatgag gctcaggaag gtcaaggact 1080 ttattgaggt tccacagtag gatacagttc ttgctaaaag caacccctcc ctcatgctct 1140 gttatctaac tgcaagggga aggtcagtgg cagaggtagt ggtcccatgg ttggtgcata 1200 agagctgctc tgagacaact gcatgctggt gggtcctgca gacatgtacc catcagccgg 1260 agataggctc aaaatatcca caagagtttg gatgattgtg ggaatgcaga atccatggtg 1320 atcaagaggg aaagtcaagt tgcctggcca ttttccttgg cttttagaca gaaaagttac 1380 gtgggatatt atctcccaca gctcttctgt ggtgccacca gtcatagtcc ttatataagg 1440 agaaaccagt tgaaattacc tattgaagaa acaaagagca aactcgccca ctgaaatgcg 1500 tagaaagccc tggactctgt tgtattcata actctgccat tatttttctg cgtagttttg 1560 ggtaagtcac ttatcttctt taggatggta atgatcagtt gcctcatcag aaagatgaac 1620 agcattacgc ctctgcattg tctctaacat gagtaggaat aaaccctgtc ttttttctgt 1680 agatcataca agtgagtgct tgggattgtt gaggcagcac atttgatgtg tctcttcctt 1740 cccag 1745 53 1060 DNA Homo sapiens 53 gtgagtacct ctggcctttc ttcagtggct gtaggcattt gaccttcctt tggagtccct 60 gaataaaagc agcaagttga gaacagaaga tgattgtctt ttccaatggg acatgaacct 120 tagctctaga ttctaagctc tttaagggta agggcaagca ttgtgtttta ttaaattgtt 180 tacctttagt cttctcagtg aatcctggtt gaattgaatt 
gaatggaatt tttccgagag 240 ccagactgca tcttgaactg ggctggggat aaatggcatt gaggaatggc ttcaggcaac 300 agatgccatc tctgcccttt atctcccagc tctgttggct atgttaagct catgacaaag 360 ccaaggccac aaatagaact gaaaactctt gatgtcagag atgacctctc ttgtcttcct 420 tgtgtccagt atggtgtttt gcttgagtaa tgttttctga actaagcaca actgaggagc 480 aggtgcctca tcccacaaat tcctgacttg gacacttcct tccctcgtac agagcagggg 540 gatatcttgg agagtgtgtg agcccctaca agtgcaagtt gtcagatgtc cccaggtcac 600 ttatcaggaa agctaagagt gactcatagg atgctcctgt tgcctcagtc tgggcttcat 660 aggcatcagc agccccaaac aggcacctct gatcctgagc catccttggc tgagcaggga 720 gcctcagaag actgtgggta tgcgcatgtg tgtgggggaa caggattgct gagccttggg 780 gcatctttgg aaacataaag ttttaaaagt tttatgcttc actgtatatg catttctgaa 840 atgtttgtat ataatgagtg gttacaaatg gaatcatttt atatgttact tggtagccca 900 ccactcccct aaagggactc tataggtaaa tactacttct gcaccttatg attgatccat 960 tttgcaaatt caaatttctc caggtataat ttacactaga agagatagaa aaatgagact 1020 gaccaggaaa tggataggtg actttgcctg tttctcacag 1060 54 1104 DNA Homo sapiens 54 gtacactgct ttgggcatct gtttggaaaa tatgacttct agctgatgtc ctttctttgt 60 gctagaatct ctgcagtgca tgggcttccc tgggaagtgg tttgggctat agatctatag 120 taaacagata gtccaaggac aggcagctga tgctgaaagt acaattgtca ctacttgtac 180 agcacttgtt tcttgaaaac tgtgtgccag gcagcatgca aaatgtttta tacacattgc 240 ttcatttaat tctcacaagg ctactctgaa gtagttacta taataaccag caattttcaa 300 atgagagaac tgtgactcaa agacgttaag taaccagctt tggtcacaca actgttaaat 360 gttggtacgt ggaggtgaat ccacttcggt tacactgggt caataagccc aggcgaatcc 420 tcccaatgct cacccaattc tgtatttctg tgtcctcaga gggggtacaa ctaggagagg 480 ttctgtttcc tgagtacagg ttgttaataa ttaaatatac tagctctaag gcctgcctgt 540 gatttaatta gcattcaata aaaattcatg ttgaattttt ctttagtact tctttcttaa 600 tataatacat cttcttgacc aagtccaaga ggaacctgcg ttggacagtt ttcatatgag 660 atcaaattct gagagagcaa gatttaaccc tttttggttc accttctgat cctcccctaa 720 ggaggtatac atgaaatatt tattactcct gcctgaactt ctttcattga atatgcaatt 780 ttgcagcatg cagattctgg atttaaattc tgagtcttaa cttactggct gagggacctt 840 
ggataggctc cttatccctc agtttcctca tctctaaaat ggggatggca cctgccccgt 900 gggttgttgg aaggacttac agaggtgcag aatgtacgtt gtacatagca ggtttcagca 960 aatgttagct ccctctttcc ccacatccat tcaaatctgt tccttctcca aaggatgtgt 1020 caaggaggaa atggacctgg ctgggaaacc ctcagaatac tgggatgatg ctgagcttgg 1080 ctcatacctg tgctttgctt tcag 1104 55 1180 DNA Homo sapiens 55 gtaagtgctg ttgacctcct gctctttctt taacctagtg ctgctgcctc tgctaactgt 60 tgggggcaag cgatgtctcc tgcctttcta aaagactgtg aaaccactcc aggggcagag 120 aaatcacatg cagtgtccct ttccaaatcc tcccatgcca tttatgtcca atgctgttga 180 cctattggga gttcacggtc tcgatccctg agggacattt tctttgttgt cttggcttct 240 agaagagtat cttttacttg ccccctccca aacacacatt tcatggtctc ctaacaagct 300 agaagaaaga ggtaaagaca agcgtgattg tggaaccata gcctcgctgc ctgcctgtga 360 catggtgacc tgtgtatcag cctgtgtggg ctgagaccaa gtggctacca cagagctcag 420 cctatgcttc ataatgtaat cattacccag atccctaatc ctctcttggc tcttaactgc 480 agacagagat gtccacagct catcaaaggc tctgccttct gggttctttg tgcttagagt 540 ggcttcctaa atatttaata ggtccctttt ctgccagtct cttctgtgcc catcccctga 600 ttgcccttgg taaaagtatg atgcccctta gtgtagcacg cttgcctgct gttcctaatc 660 atcttctcct acctcctctt tacacctagc tcctgtttca gtcacctaga aatgctcaca 720 gtcgctggaa tatgtcatgt tcttccacac ctccatgcct ttgtaggtac tgtttgctct 780 cacaggagaa ctttctctct aacttgccta tcttctcaac tcctcctttc tctccaagat 840 ctagttccgg atcccctccc ctgagcatcc ctccttggtt ctcaggtagt cagtcactct 900 ctgccctgaa cttccatggc acgtgaaaga aaatcttttt attttaaaac aattacagac 960 tcacaagaag taatacaaat tacatgaggg ggttccctta aacctttcat ccagtttccc 1020 caatggtagc agcatgtgta actgtagaat agtatcaaaa ccatgaaatt gacataggta 1080 caattcacaa accttcttca gatttcacta gctttatgtg cgctcatttg tgtgtgtgtg 1140 tgcgtattta gttctatgca attttatcat gtgtgaattc 1180 56 903 DNA Homo sapiens 56 gatccctggg ccaagggaag gagcacatga ggagttgccg aatgtgaaca tgttatctaa 60 tcatgagtgt ctttccacgt gctagtttgc tagatgttat ttcttcagcc taaaacaagc 120 tggggcctca gatgaccttt cccatgtagt tcacagaatt ctgcagtggt cttggaacct 180 gcagccacga aaagatagat tacatatgtt 
ggagggagtt ggtaattccc aggaactctg 240 tctctaagca gatgtgagaa gcacctgtga gacgcaatca agctgggcag ctggcttgat 300 tgccttccct gcgacctcaa ggaccttaca gtgggtagta tcaggagggg tcaggggctg 360 taaagcacca gcgttagcct cagtggcttc cagcacgatt cctcaaccat tctaaccatt 420 ccaaagggta tatctttggg gggtgacatt cttttcctgt tttcttttta atcttttttt 480 aaaacataga attaatatat tatgagcttt tcagaagatt tttaaaaggc agtcagaaat 540 cctactacct aacacaaaaa ttgtttttat ctttgaataa tatgttcttg tttgtccatt 600 ttccatgcat gcgatgttag gcatacaaaa tacatttttt aaagaatact ttcattgcaa 660 attggaaact tcgtttaaaa aatgctcata ctaaaattgg catttctaac ccataggccc 720 acttgtagtt atttaccgaa gcaaaaggac agctttgctt tgtgtgggtc tggtagggtt 780 cattagaaag gaatgggggc ggtgggaggg ttggtgttct gttctctctg cagactgaat 840 ggagcatcta gagttaaggg taggtcaacc ctgacttctg tacttctaaa tttttgtcct 900 cag 903 57 486 DNA Homo sapiens misc_feature (97)..(97) n is a, c, g, or t 57 gtgagtacca gcagcacgtt aagaataggc cttttctgga tgtgtgtgtg tcatgccatc 60 atgggaggag tgggacttaa gcattttact ttgctgngtt tttgtttttt ctttttttct 120 tttttatttt tttgagatgg agtctcgctc tgtagccagg ctggactgta gtggcgcgat 180 ctcggctcac tgcaaccttg gcctcccagg ttcaagcgat tctcctgcct cagcctcccg 240 agtagctggg actctaggca cacaccacca tgcccagcta atttttgtgt ttttagtaga 300 gacggggttt caccatgttg gccaggatgg tctcaatgtc ttgacctcgt gatccgccca 360 cctcggtctc ccaaagtgct gggaacacag gcatgagcca ctgtgtctgg ccacatttta 420 ctttctttga atatggcagg ctcacctccg tgaacacctt gagacctagt tgttctttga 480 ttttag 486 58 283 DNA Homo sapiens 58 gaaattgaaa gttgtaactg cctggtgcat ggtggccagg cctgctggaa acaggttgga 60 agcgatctgt cacctttcac tttgatttcc tgagcagctc atgtggttgc tcactgttgt 120 tctaccttga atcttgaaga ttatttttca gaaattgata aagttatttt aaaaagcacg 180 gggagagaaa aatatgccca ttctcatctg ttctgggcca ggggacactg tattctgggg 240 tatccagtag ggcccagagc ttgacctgcc tccctgtccc cag 283 59 203 DNA Homo sapiens 59 gtgcggccca gagctacctt ccctatccct ctcccctcct cctccggcta cacacatgcg 60 gaggaaaatc agcactgccc cagggtccca ggctgggtgc ggttggtaac agaaacttgt 120 ccctggctgt 
gcccctaggt cctctgcctt cactcactgt ctggggctgg tcctggagtt 180 tgtcttgctc tgtttttttg tag 203 60 702 DNA Homo sapiens 60 gtgcctgatg tgtatttatt ctgagtaaat ggactgagag agagcggggg gcttttgaga 60 agtgtggctg tatctcatgg ctaggcttct gtgaagccat gggatactct tctgttatca 120 cagaagagat aaagggcatt gagactgaga ttcctgagag gagatgctgt gtctttattc 180 atctttttgt ccccaacatg gtgcactaaa tttatggtta gttgaaaggg tggatgctta 240 aatgaatgga agcggagagg ggcaggaaga cgattgggct ctctggttag agatctgatg 300 tggtacagta tgaggagcac aggcaggctt ggagccaact ctggctggcc ctgagacatt 360 gggaaagtca caacttgcct caccttcttt gccgataata atagtggtgc ttacctcata 420 gaggattaaa ttaaatgaga atgcacacaa accacctagc acaatgcctg gcatatagca 480 agttcccaaa taaaatgcta ctgttcttac ctctgtgagg atgtggtacc tatatataca 540 aagctttgcc attctagggg tcatagccat acagggtgaa aggtggcttc caggtctctt 600 ccagtgctta cccctgctaa tatctctcta gtccctgtca ctgtgacaaa tcagaactga 660 gaggcctcac ctgtcccaca tccttgtgtt tgtgcctggc ag 702 61 1258 DNA Homo sapiens 61 gtgagctgca gtcttggtgc tgggctggtg ttgggtctgg gcagccagga cttgctggct 60 gtgaatgatt tctccatctc cacccctttt gccatgttga aaccaccatc tccctgctct 120 gttgcccctt tgaaatcata tcatacttaa ggcatggaaa gctaaggggc cctctgctcc 180 cattgtgcta gttctgttga atcccgtttt ccttttccta tgaggcacag agagtgatgg 240 agaaggtcct tagaggacat tattatgtca aagaaaagag acttgtcaag aggtaagagc 300 cttggctaca aatgacctgg tgttcctgct cattactttt caatctcatt gaccttaact 360 tttaaactat aaaacagcca atatttatta ggcactgatt tcatgccaga gacactctgg 420 gcatgaaaga aagtaatgat aatagttaat tttatatagc gttgttacca tttacaacct 480 tttttttttt tttaacctct atcatctcaa ttaaagtgca gagagaccct gggaagaagg 540 taactatatt tattatccca gatgagggaa gtgaggcttg tagggaattg gtagctgatt 600 caaggtcacc cagcaggtaa ataacagtgg tgggaccaga cccaattacc aggtatgttt 660 tcctctgtac cgcagtacat gcctgagatt tatttgtgtg ttgaagccag tggtacctaa 720 tgtatttaca tcccaacctg aaactcctat ccacttattt accttttaat gagcctctta 780 actcaagtgc agtctgagga ccagcagcat caggatcact tgggaacttg ttagaaattc 840 agcaacctgg gcccagctca gacctaccga atcagaatct gtgcatttta 
acaaggttct 900 tgagtggttg aacacacatt aaagcatgag aagcattgaa ctagacatgt agccaggtaa 960 aggccttgcc tgagatggtt ggcaaaggcc tcattgcagc attcattggc aggccacagt 1020 tcttttggca gctctgcttc ctgacctttc accctcagga agcgaggctg ttcacacggc 1080 acacacatgc cagacagggt cctctgaagc cacggctgcc agtgcatgtg tcccagggaa 1140 agctttttcc tttagttctc acacaacaga gcttcttgga agccctcccc ggcaaaggtg 1200 ctggtggctc tgccttgctc cgtccctgac ccgttctcac ctccttcttt gccatcag 1258 62 986 DNA Homo sapiens misc_feature (502)..(502) n is a, c, g, or t 62 gtaaggactc tggggtttct tattcaggtg gtgcctgagc ttcccccagc tgggcagagt 60 ggaggcagag gaggagaggt gcagaggctg gtggcgctga ctcaaggttt gctgctgggc 120 tggggctggg tggctgcggg tgtgggagca gcttggtggc gggttggcct aatgcttgct 180 ggggtgcctg gggctcggtt tgggagctag cagggcagtg tcccagagag ctgagatgat 240 tggggtttgg ggaatccctt aggggagtgg acactgaata ccagggatga ggagctgagg 300 gccaagccag gagggtggga tttgagctta gtacataaga agagtgagag cccaggagat 360 gaggaacagc cttccagatt tttcttgggt agcgtgtgta ggaggccagt gtcaccagta 420 gcatatgtgg aacagaagtc ttgacccttg ctatctctgc ctagtcctaa tggctggctt 480 ttcccaggaa ggcttctgct tncatggacn gntagattaa ccctttattt aggtaaatga 540 gggaacctac tttataagca taggaaaggg tgaagaatct tttaagattc ctttactcaa 600 gttttctttt gaagaatccc agagcttagg caatagacac cagactttga gcctcagtta 660 tccattcacc catccaccca cccacccacc catccttcca tcctcccatc ctcccattca 720 cccatccacc catccagctg tccacccatt ctacactgag tacctataat gtgcctggct 780 ttggtgatac aaaggtgaat aagacatagt cctttccttt gcccccaacc ctcagaccag 840 agatgaacat gtggaatgac ctaaacacct ggaacaggtg tggtgtatga gcggcaggcc 900 tctgatgaga gggtggggga tggccagccc tcactccgaa gcccctctga gttgattgag 960 ccatctttgc attctggtcc ctgcag 986 63 1667 DNA Homo sapiens 63 gtaagttaag tggctgactg tcggaatata tagcaaggcc aaatgtccta aggccagacc 60 agtagcctgc attgggagca ggattatcat ggagttagtc attgagtttt taggtcatcg 120 acatctgatt aatgttggcc ccagtgagcc atttaagatg gtagtgggag atagcaggaa 180 agaagtgttt tcctctgtac cacagtacat gcctgagatt tgtgtgttga aaccagtggt 240 acctaacaca tttacatccc 
aaccttaaac tcctatgcac ttatttaccc tttaatgagc 300 ctctttactt aagtacagtg tgaggaacag cggcatcagg atcacttggg aacttgttag 360 aaatacagca acttgggccc agctcagacc tactgaatca gaatcaggag caattctctg 420 gtgtgactgt gtcacagcca ggtatcaact ggattctcat acataggaaa tgacaaacgt 480 ttatggatgg atagtctact tgtgccaggt gctgagattt gttttttgtt ttttgatttt 540 tttttaatca ctgtgacctc atttaattct caaaaaaaga tgaaaaaatg aacactcagg 600 aatgctgaca tgagattcag aatcaggggt ttggggcttc aaagtccatc ctctctttat 660 ccatgtaatg cctcccctta gagatacaac atcacagacc ttgaaggctg aaggggatat 720 aaaagctgtc tggccaagtg gtctccaagc ttgacagtgc agcagaatca cctggggata 780 ttattaaaaa taaacatact aaggtttggc ttcagggcct gtgaatcaga atttctggag 840 gtgaggcctt gaagtctgta tttctattgc atactttgga cacagtggtc tatagactag 900 agtttggaaa tgattgcgct cattcagatt ctcttctgat gtttgaattg ctgccatcat 960 atttctagtg ctctatttcc tcctgctcat tctgtcttgg ataacttatc atagtactag 1020 cctactcaaa gatttagagc cacagtcctg aaagaagcca cttgactcat tccctgtagg 1080 ttcagaataa atttcttctg cgcagtgtct gtcatagctt tttttaaatt tttttttatt 1140 tttgatgaga ctggagtttt gctcttattg cccaagctgg agtgcagtgg tgcgattttg 1200 gctcactgca acctccacct cccaggttca agcgattctc ctgcctcagc ctcccaagta 1260 gctgagatta caagcatgtg ctaccacgcc cagctaattt tgtattttta gtagagatgg 1320 gttttatcca tgttggtcag gctggtctcg agctccagac ctcaggtgat ctgcccgcct 1380 cggcctccca aagtgctggg attataggcc tgagccacag cgctcagcca taactttaat 1440 ttgaaaatga ttgtctagct tgatagctct caccactgag gaaatgttct ctggcaaaaa 1500 cggcttctct cccaggtaac tctgagaaag tgttattaag aaatgtggct tctactttct 1560 ctgtcttacg gggctaacat gccactcagt aatataataa tcgtggcagt ggtgactact 1620 ctcgtaatgt tggtgcttat aatgttctca tctctctcat tttccag 1667 64 195 DNA Homo sapiens 64 gtaactgcct tgagggagaa tggcacactt aagatagtgc cttctgctgg ctttctcagt 60 gcacgagtat tgttcctttc cctttgaatt gttctattgc attctcattt gtagagtgta 120 ggtttgttgc agatggggaa ggtttgtttt gttgtaaata aaataaagta tgggattctt 180 tccttgtgcc ttcag 195 65 284 DNA Homo sapiens misc_feature (90)..(90) n is a, c, g, or t 65 gtctgttagg 
gcaagatcaa acagtgtcct actgtttgaa tgtgaaattc tctctcatgc 60 tctcacctgt tttctttgga tggcctttan ccaaggtgat agatccctac agagtccaaa 120 gagaagtgag gaaatggtta aagccacttg ttttttgcag catcgngcat gtnatcaaac 180 ctganagagc ctatccatat cactttnctt taanagacat taaanatggn tccttaatct 240 cttttgancc cattgtattt attattcttt ttctgcgggg gtcc 284 66 560 DNA Homo sapiens 66 tctagaaaat ttttaggaac agaaaacttt ccagttctct cacccctgct caaagagtgt 60 atggctctta cattatatat aactgcctga cttcatacag tatcagtact tagatcattt 120 gaaatgtgtc cacgttttac caaaatataa tagggtgaga agctgagatg ctaattgcca 180 ttgtgtattc tcaaatatgt caagctacgt acatggcctg tttcatagag tagtctataa 240 gaaattgatg acttgattca tccgaatggc tggctgtaac acctggttac gcatgaacac 300 ctcttttcag ttgtctcaag acacctttct tttctgtact tatcagacaa ggactgaaag 360 gcagagactg ctactgttag acattttgag tcaagctttt ccttggacat agctttgtca 420 tgaaagccct ttacttctga gaaacttcta gcttcagaca catgccttca agatagttgt 480 tgaagacacc agaagaagga gcatggcaat gccgaaaaca cctaagataa taggtgacct 540 tcagtgttgg cttcttgcag 560 67 1649 DNA Homo sapiens 67 gtgagacgtg ctgttttcgc cagagactct ggcttcatgg gtgggctgca ggctctgtga 60 ccagtgaagg caggatagca tcctggtcaa gatatggatg ccggagccag atttatctgt 120 atttcaatcc cagttctatt ccttgccagt tgtgtatccg ctggcaagtt acttctctat 180 gcctcaatct cctcatctgt aaaatgggga taataatatt acctgcaata cagggttgtt 240 acgaaaataa aaatgaatag gtgcttagaa tggggcctga cattagtaag tgcttagttt 300 tgtgtgtgta tatgttattt ttattttgga ggagaacata aaaaggacaa agtgtagaaa 360 aactggttgg gtgtattcag ctgtcataac atgagagttg ttatgcccag atgcacttga 420 catgtgaatt tattagaaac atgatttttc tctgagttga tgtttaactc aaactgatag 480 aaaagatagg tcagaatata gttggccaac agagaagact tgttagacta ttgtctgcat 540 gtcagtgttt gcatgctaac ttgcttagtt agaaaggtta aattttttca ctctataaaa 600 tcaagaaata tagagaaaag gtctgcagag agtctttcat ttgatgatgt ggatattgtt 660 aagagcggga gtttggagca tacagagctc aagttgaatc ctgactttgc tacttattgg 720 ctatatgacc ttgggcaagc tgcttagtct ctctgatcct cagttacctt tgtttgttga 780 tgatgaccat tgataacaca accataaata atgacaacat agagatagtt 
ctcattatag 840 tagttgttat acagaattat tcactcaatg ttaattttct gcattgaaat cccagaacat 900 tagaattggg ggcattattt gaatctttaa ggttataagg aatacatttc tcagcaataa 960 atggaaggag ttttgggtta acttataaag tatacccaag tcattttttt ttcagagaag 1020 atatggtaga aagtcttagg aggttgaaga aggaattgga tatttattct ttctgagact 1080 atcatgggag ataatgacta tggttgtcca tgattggagc cgttgctgta gagttggttt 1140 tattatagtg taggatttga atgggccatg tgttctcaga cctcagatta aaatgagaaa 1200 actgaggcca gtggggagcg tgacttcaca tgggtacact tgtgctagag acagaaccag 1260 gattcaggac ttctggctcc tggtcctggg ttcatggccc aatgtagtct ttctcagtct 1320 tcaggaggag gaagggcagg acccagtgtt ctgagtcacc ctgaatgtga gcactattta 1380 cttcgtgaac ttcttggctt agtgcctctg ccaggtggcc ataacctctg gccttgtgtt 1440 gccagagaaa aggtttagtt ttcaggctcc attgcttccc agctgccaag aatgccttgg 1500 tgcagcacag tcataggccc tgcattcctc attgccgtgc tggttggtcg gggaggtggg 1560 ctggactcgt agggatttgc cccttggcct tgtttctaac acttgccgtt tcctgctgtc 1620 cccctgcccc ctccactgcc tgggtaaag 1649 68 1230 DNA Homo sapiens 68 gtatgtttgt cttctacatc ccaggagggg gtaagattcg agcagaccaa agatgtttac 60 gagggccaag ggaatggact tcagaattac acggtggaat gaattttact gctgcggctc 120 aggtccctgt ataagctaat actgcatgca tagaacagca gcgaactaac cctgaataat 180 aggccagtct tctgttgagc ctttcagcct ctctcctctt catcctactg ttgtcaggaa 240 cagccacatg tgttttaggt gaaataatcc acccttgcaa aaatccatga ttaagttata 300 aaatatttgg atttgtggag ctgtgtttta attctgtaac tgagtcacag ggcacactgt 360 caaagcatag aacctccaga gacttgtttt ctgcaaagta taattcatgt aattattatc 420 tattctgtta tatttgggat gttaggtagt gtttgttctt tagataaaaa tatcccccac 480 tctgtaacaa tacattaaat caaagaaaag gacaaaggat ttttctgggt cttgttagca 540 ggagctttct tcagtcctga aagatttgta gacctgtaga tgggggaact gtgtcagtga 600 tacaaaaggg aagcatttaa aaaaaaaaag tatatatata tatatatata tatgtaatgt 660 gaattggcct ctttttctct aagcccacat tttcttctta catagttcag gtttacttta 720 ttttttcctt tccggctgct gaccctgtat tgcccgtagt tgtggaacat agcatgtgtt 780 tgtgacctgt gcctgttatt tttgtgcttt ctagttgtgc atgcaaagag tacaaagttt 840 tcttgccctt 
tcttggaaaa tcctgcttgt ctgtgccaaa gggataattg tgaaagcact 900 tttgaaatac ttaatgagtt gattttcttc aaattaaaaa aaatatataa atgtatctgt 960 gtatgtacat gtgtgtacac atacacacct ttatacatac agcccattta aaacaagctc 1020 cactttggag tgctctacgt caccctgatg ccgaatacag ggccagagtc tgagatcctt 1080 ctgggtggtt tctgtgtttt gttcatttct gttttaagag cctgtcacag agaaatgctt 1140 cctaaaatgt ttaatttata aaaacatttt tatctctcga ttactggttt taatgaatta 1200 ctaagctggc tgcctctcat gtacccacag 1230 69 3035 DNA Homo sapiens 69 gtgagtgcca ctttagccat aagcaggctt cttgtgcttg ttgcctggtt tgatttctaa 60 tatgctgcat ttatcaactg catgccacat tgtgaccgcc agcatttgcc ctttgaatta 120 ttattatgtt ttatttacaa aaagcgaagg tagtaaccga actaaattat ctaggaacaa 180 acgtttggag agtcttctaa caccgtgcaa agcacgtcat tacagacatt tgtttactga 240 tttagaacct taatatttaa tttaaatagc actttacact tactgatgaa atgcttttcc 300 tttctttctc tcccagcccc tgtacttaag tgcttcaata ggctctcatt atatatgatt 360 tttaggtttt gcttatcagc ttcttcgctt ttataatctg aaaagatggc atatgaattt 420 ttataaaaag ggacactttc ttcttctcaa attgtatatt tttattgtac tttccttcaa 480 aacccccttt taaaaagtaa gcagtggata aataaattca gtgaagcatc catatgaccc 540 ttaagtgagt gtaggggaag ggaggtcacc agatcactgt gagtgaagat ggtggagagg 600 tgaggatctt atgaggccgt gctcaaggct ggtagaggtg ggttagtgtt tccaggttta 660 ggcagaatct cagctgaggt catgaaacaa cagtgatctc tgaaaaatta tggcaaggtg 720 ggaaggtgct ggagaattgg agagggggca aacttgactt tcaagtttca atgggaagat 780 aggtgactct gcacaccaca gaacagtgag catgataacc tgtttataca aggttctaga 840 gcagatttct aaatggatag ctactgtgtg cttgtttgtt cttaattagt attggatagt 900 tactaaatac ttgttagtac ttagtacata atgggtggta aatcctagca gctaatattg 960 gttcccaaat aaccagatga caaggataga gaaggacaca gacacggcct atctggattt 1020 catggtgcct ttcattttcc acatgaaggt tgtgtaggga agatagaagc atgagatgag 1080 atgataatat agttatctgg attcatcact ggccagctga accatatgaa ctcatggatt 1140 gatgctagct taggaaggct ctgtaggagc cagaactggg ctgagagcca gcccatagag 1200 acaaaagagg cccggccctg acatcagagg gttcaaacat gatgtctgag ccccacctac 1260 agtctgccgg aggtggttgg aaggaagagc ctttatcctt 
acaattctta ctgaaattca 1320 aatttttagg ttttgcaaaa aaatggtgga cctgaaggaa atttgacagg agcatgtctc 1380 agctgtattt aaatttgtct cagccaatcc ccttttgaat gttcagagtg taagcttcag 1440 gagggcagcg cgtcttagtg tgacttttct ggtcagttca ggtgctttaa ggagacaatt 1500 agagatcaat ctggaaaact tcatttgaat ttttaataca taagaaaaca ataagaaata 1560 gttaaaaata tatatttata atatatatat gtgtgtgtgt gtgtgtgtgt gtgtgtgtgt 1620 atatatatat attttattta tttatttttt tttgagatgg agtctcgctc tgttgcccag 1680 gctggagtgc agtggctcaa tcttggctca ctgccacctc tgcctcccag gttcaagtga 1740 ttctcctacc tcagcctcct gagtagctgg gattacaagc atgtgccacc acactggcta 1800 atttttctaa ttttagtaga gatggagttt caccatgttg gacaggatgg tcttgaactc 1860 ctgacttagt gatccacccg ccttcgcctc ccaaagttct gggattacag gcatgagcca 1920 tcgtgcctgg caattatatt taatatttaa taataaggaa ataattgctg taactttact 1980 ttaaattgtg gaattctgaa actggaaggg aactggaaat gacttgttga atcaaatcat 2040 tttaaacttt tattttgcca gtggaaaaaa taagccccca aaagagcagg ggacctgctg 2100 atgtcccaca gtaattcaga gctggagatg aggttgaagg ctttgtgtct tatctccagg 2160 gaaaatttgt agacagcgta gctctttatg tgacgagcat tctcacccca gtcatccccc 2220 aattctctac tcatttgaga acataaattg gatcttgcca gtctctactc atttttcagc 2280 acatcgagca taagatccag actctttccc aggcctctct catctggctc ctctcctcct 2340 cctttatcat tactcttctt cgtagcttat cctactccag ccatgctgtc ttcctattat 2400 tcctaaaaag tagaaatgca tttcttccta gggcctttgt acctgcactt gccatcgctt 2460 ttgctcagaa tgttcttttt gccaagcttt tgcccagctt gttctccatc attgttatgt 2520 tttggctgaa atgtcttctc ttagtaggtt cattctcccc agtcactgtc tttttatttt 2580 gctttatttt gggccatcta aggttatctt attagtgtat ttgttgttcg tctcctccat 2640 gggcatacac ctccatgaag gcaggtattt tcaccttagg ccctcgaata tactggacag 2700 catctggcac gtagtagatg ctcaacgaat gtttgttgtg tgagcaaatg gttggttgat 2760 tggattgaac tgagttcagt atgtaaatat ttagggcctc tttgcattct attttactta 2820 tgtataaaat gatacataat gatgatataa atgatgtcac agtgtacaag gctgttgtgg 2880 gatcaagcaa tcaaatgaga tcatgcttgt cttttccaaa tggtgaggga atagatgcat 2940 gtttgtggtt gttacggaat gatcctgtgc tcctgaggca acagaaaggc 
caggccatct 3000 ctggtaatcc tactcttgct gtcttccctt tgcag 3035 70 342 DNA Homo sapiens 70 gtgagtcact ttcagggggt gattgggcag aaggggtgca ggatgggctg gtagcttccg 60 cttggaagca ggaatgagtg agatatcatg ttgggagggt ctgtttcagt cttttttgtt 120 ttttgttttt ttttctgagg cggagtcttg ctctgtcgcc caggctggag tgctgtggca 180 tgatcttgcc tcactgcaac ctccacctcc caggttcaag cgattctcct gcctcagcct 240 cctgagtagc tgggattaca ggcacgcacc accatgtctg gctaattttt gtgtttttag 300 tagagatagg gtttcgccgt gttggctagg ctggtctgga at 342 71 1182 DNA Homo sapiens 71 gaattcctga cctcaggtga tccacccgcc tcggcctccc aaagtgctgg gattacaggc 60 gtgagccact acgcccagcc ctgtttcagt ctttaactcg cttcttgtca taagaaaaag 120 catgtgagtt ttgaggggag aaggtttgga ccacactgtg cccatgcctg tcccacagca 180 gtaaagtcac aggacagact gtggcaggcc tggcttccaa tcttggctct gcaacaaatg 240 agctggtagc ctttgacagg cctgggcctg tttcttcacc tctgaattag ggaggctgga 300 ccagaaaact cctgtggatc ttgtcaactc tggtattctt agagactctg tttgggaagg 360 agtcctgagc catttttttt ttcttgagaa tttcaggaag aggagtgctt atgatagctc 420 tctgctgctt ttatcagcaa ccaaattgca ggatgaggac aagcaattct aaatgagtac 480 aggaactaaa agaaggcttg gttaccactc ttgaaaataa tagctagtcc aggtgcgggg 540 tggctcacac ctgtaatctc agtattttgg gatgccgagg tggactgatc acctaaggtc 600 aggagttcga aaccagcttg gccaatgtgg cgaaaccctg tctctactaa aaattcaaaa 660 attagccagg catggtggca catgcctgta atcccagtta cttgggaggc tgaagcagga 720 gaattgcttg aacctgggag gtggaggtcg cagggagcca aaattgcgcc actgtactcc 780 agcctgagca acacagcaaa actccatatc aaaaaataaa atgaataaaa taacagctaa 840 tctagtcatc agtataactc cagtgaacag aagatttatt aggcatagtg aatgatggtg 900 cttcctaaaa atctcttgac tacaaagaat ctcatttcaa tgtttattgt ttagatgttc 960 agaataaatt cttgggaaag accttggctt ggtgtaagtg aattaccagt gccgagggca 1020 gggtgaacca agtctcagtg ctggttgact gagggcagtg tctgggacct gtagtcaggt 1080 ttccggtcac actgtggaca tggtcactgt tgtccttgat ttgttttctg tttcaattct 1140 tgtctataaa gacccgtatg cttggttttc atgtgatgac ag 1182 72 1309 DNA Homo sapiens 72 gtgacttttt actaaacttg gcccctgccg tattattact aattagagga attaaagacc 60 
tacaaataac agactgaaac agtgggggaa atgccagatt atggcctgat tctgtctatt 120 ggaagtttag gatattatcc caaactagaa aagatgacga gagggactgt gaacattcag 180 ttgtcagctt caaggctgag gcagcctggt ctagaatgaa aatagaaatg gattcaacgt 240 caaattttgc cacttagtag caacttgacc aggtaactgg ttatcctttt aaagccttag 300 tttatctaaa ttgtgatatt aatgttgctc ttataagttt gtcatgagga ctaaattaaa 360 tggtgtacat agagtgcctt gggtactctc tgatggggga ctccatgata atttgtggtc 420 tcatggaggg agctctggga aggtttagga gcctgccttg gctctgcagc cttgggagag 480 ccttctagct tcccaggaca tggcagccta gtgttgaatg cttggctcag caaatgtttg 540 ttctcgtttc cttcccatca acttggtcag ttggggtctt tcagttagga gtatctcagt 600 gactttaaat ggcatgggca tgctggagtg atagtgacca tgagtttcta agaaagaagc 660 ataatttctc catatgtcat ccacaattga aatattattg ttaattgaaa aagcttctag 720 gccaggcacg gtggctcatg cctgtaatcc cagcacttta ggaggccaag gcgggtggat 780 cacttgaggt caggagtttg agaccagcct ggccaacatg gggaaaccct gtctctacta 840 aaaatacaaa ataagctggg cgtggtggtg cgtgcctgta atcccagcta cttgggaggc 900 tgaggcagga gaattgcttg aatctgggag gcggaggttg cagtgagctg agttcatgcc 960 attgcattcc agcctgggca acaagagcga aaccatctcc caaaagaaaa aaaaaagaaa 1020 gaaaaagctt ctagtttggt tacatcttgg tctataaggt ggtttgtaaa ttggtttaac 1080 ccaaggcctg gttctcatat aagtaatagg gtatttatga tggagagaag gctggaagag 1140 gcctgaacac aggcttcttt tctctagcac aaccctacaa ggccagctga ttctagggtt 1200 atttctgtcc gttccttata tcctcaggtg gatatttact ccttttgcat cattaggaat 1260 aggctcagtg ctttctttga actgattttt tgtttctttg tctctgcag 1309 73 1124 DNA Homo sapiens 73 gtaagttgct gtctttctgg cacgtttagc tcagggggag gatggtgttg taggtgtctt 60 ggattgaaga aagccttggg gattgtttgt cactcacaca cttgtgggtg ccatctcact 120 gtgaggagga cagaagccct gtgaacatgt ggagcacaca ggggcacaga cagatttaga 180 ttaggcctgc tttatagagt ttctgcctag agcatcatgg ctcagtgccc agcagcccct 240 ccagaggcct ctgaaatatt tgatatactg atttccttga ggagaatcag aaatctcctg 300 caggtgtcta gggatttcaa gtaagtagtg ttgtgagggg aatacctact tgtactttcc 360 ccccaaacca gattcccgag gcttcttaag gactcaagga caatttctag gcatttagca 420 cgggactaaa 
aaggtcttag aggaaataag aagcgccaaa accatctctt tgcactgtat 480 ttcaacccat ttgtccttct gggttttgaa ggaacaggtg ggactgggga cagaagagtt 540 cttgaagcca gtttgtccat catggaaaat gagataggtg atgtggctac gtcagggggc 600 ccgaaggctc cttgttactg atttccgtct tttctctctg ccttttcccc aagggccagg 660 acccctggat ctctgggcag agcagacgca ggcccctata atagccctca tgctagaaag 720 gagccggagc ctgtgtataa ggccagcgca gcctactctg gacagtgcag ggttcccact 780 ctcccaactc cccatctgct tgcctccaga cccacattca cacacgagcc actgggttgg 840 aggagcatct gtgagatgaa acaccattct ttcctcaatg tctcagctat ctaactgtgt 900 gtgtaatcag gccaggtcct ccctgctggg cagaaaccat gggagttaag agattgccaa 960 catttattag aggaagctga cgtgtaactt ctctgaggca aaatttagcc ctcctttgaa 1020 caggaatttg actcagtgaa ccttgtacac actcgcactg agtctgctgc tgatgatact 1080 gtgcacccca ctgtctgggt tttaatgtca ggctgttctt ttag 1124 74 1473 DNA Homo sapiens 74 gtaaaatatc tatcgtaaga tgtatcagaa aaatgggcat gtagctgctg ggatatagga 60 gtagttggca ggttaaacgg atcacctggc agctcattgt tctgaatatg ttggcataca 120 gagccgtctt tggcatttag cgatttgagc cagacaaaac tgaattactt agttgtacgt 180 ttaaaagtgt aggtcaaaaa caaatccaga ggccaggagc tgtggctcat gcctgtaatc 240 ctagcacttt gggaggccga agcgggtgga tcacttgagg tcaggagttc gagaccagcc 300 tggcctacat gacaaaaccc cgtatctact aaaaatacaa aaaaattagc tgggcttggt 360 ggcacacacc tgtaatccca gctacttggg aggctgaggc aggagaattg cttgaaccct 420 gtaggaagag gttgtagtga gccaagatcg caccgttgca ctccagcctg ggcaacaaga 480 gcaaaactcc atctcaaaaa acaaattaaa tccagagatt taaaagctct cagaggctgg 540 gcgcggtggc ttacacctgt tatcccagca ttttgggatg ccgaggcggg caaagcacaa 600 ggtcaggagt ttgagaccag cctggccaac atagtgaaac cctgtctctg ctaaaaacat 660 agaaaaatta gccgggcatg gtggcgtgcg cctgtaatcc cagctactcg ggaggctgag 720 gtgagagaat tacttgaacc cgggaggcgg aggttgcagt gagcccagat tgcaccactg 780 cactccagcc tgggcgacag agcaagactc catctcaaaa aaagctctca gaacaaccag 840 gtttacaaat ttggtcagtt ggtaaataaa ctgggtttca aacatacttt gctgaaacaa 900 tcactgacta aataggaaat gaatcttttt tttttttttt taagctggca agctggtctg 960 taggacctga taagtactca cttcatttct 
ctgtgtctca ggtttcccat ttttaggtga 1020 gaattaaggg gctctgataa aacagaccct aggattgtgg acagcagtga tagtcctaga 1080 gtccacaagt ctgcttttga gtgatgggcc catgtatctg gcacatctgc aggcagagcg 1140 tggttctggc tcttcagatg atgccggtgg agcactttga ggagtcctca ccccaccgtg 1200 ataaccagac attaaaatct tggggctttg catcccagga tttctctgtg attccttcta 1260 gacttgtggc atcatggcag catcactgct gtagatttct agtcacttgg ttctcaggag 1320 ccgtttattt aatggcttca catttaattt cagtgaacaa ggtagtggca ttgctcttca 1380 cagggccgtc ctgttgtcca caggttccag attgactgtt gccccttatc tatgtgaaca 1440 gtcacaactg aggcaggttt ctgttgttta cag 1473 75 292 DNA Homo sapiens 75 gtaaaccgct gtctttgttc tagtagcttt ttgatgaaca ataatcctta tgtttcctgg 60 agtactttca actcatggta aagttggcag gggcattcac aacagaaaag agcaaactat 120 taactttacc agtgaggcag tacggtgtag tgtagtgatt cagagaattt gctttgccac 180 cagacatacc aggtaacctt gactaagtta cttaacctat ctaaacctca gttccctcat 240 ctgtgaaatg gagacagtaa tcatagctat ttccaaactg ttgtgagaat tc 292 76 235 DNA Homo sapiens 76 gaattcaatg agttaaaggt ataaggtcct caccacagcg cctgcccaca tagtcagtga 60 tcactatgtc ctgaacactg taattacttc gccatattct ctgatcatag tgttttgcct 120 tggtatgtga ctagaatttc tttctgaggt ttatgggcat ggttggtggg tatgcacctg 180 cctgcaggag cccggtttgg gggcattacc ttgtacctgg tatgttttct ttcag 235 77 240 DNA Homo sapiens 77 gtaagtgtgg ctgtgtctgt atagatggag tggggcaagg gagagggtta tggagaaggg 60 gagaaaaatg tgaatctcat tgtaggggaa cagctgcaga gaccgttata ttatgataaa 120 tctggattga tccaggctct gggcagaagt gataagttta cgaattggct ggttgggctt 180 cttgaactgc agaagagaaa atgacactga tatgtaaaaa tcgtaacatt tagtgaattc 240 78 988 DNA Homo sapiens 78 gaattcatat aaagtgagtt caaaaattgt taattaaatt ataatttaat tataagtgtt 60 taatcagttt gatttgttta aaaaccactg ttttaaattt ggtggaatat gtttttatta 120 gcttgtatct ttaattccta aattaagctg tgtgtgtgtg tgtgtgtgtg tgtgtgtgtg 180 tgtgtgtgtg aagtttaaag ccaggatgag ctagtttaaa gtatgcagcc tttggagtca 240 tacagatctg ggtttgaatc tggtctctaa actttataga tgtatgatat taaatgaggc 300 agttcatgta aattgccaag cccagcactc agcacagagt tgatatttca cacacattag 360 
atacctttcc tgtatgtgga gcatggcagt tcctgtttct gctttactcc tacaggatac 420 taatatagga cactaggatc tttataccaa gaccccatgt aatgggctta tgagaccatt 480 cttcttataa aaatctgaca gaatttttgt atgtgttaga tcaataggct gcatactgtt 540 attttcaagt tgatttacag ccagaaatat taatttattt gagtagttac agagtaatat 600 ttctgctctc atttagtttt caagccccac tagtcctttg tgtgtgaaaa tttacaactt 660 actgctctta caaggtcatg aacagtggac caaagtgaat gccattaacc actctgactt 720 ccttcattag ttttattgtg acagtggact cttttgacct cagtaatacc agtttggcat 780 ttacattgtc atatttttag acttaaaaat gatcatctta accctgaata aaatgtgtct 840 ggtgaacaga tgttttttct tgggctgtgc ctcagatatc tcctgtgtgt gtgtacgtgt 900 gtgtttgtct gtgtgtccat gtcctcactg attgagccct agctgcatca aaagacccct 960 cagattttca cacgcttttt ctctccag 988 79 498 DNA Homo sapiens 79 gtaaggacac aggcctgctg tatctttctg atgtctgtca gggccatgga ttgatatgga 60 taagaaagaa agagctctgg ctatcatcag gaaatgttcc agctactcta aagatgtatg 120 aaaaagaaat agccagaggc aggtgatcac tttcatgaca ccaaacacag cattgggtac 180 cagagttcat gtcacaccag agggaaaatt ctgtacacaa tgatgaaaat taataccact 240 accacttaag ttcctatgtg acaactttcc caagaatcag agagatacaa gtcaaaactc 300 caagtcaatg cctctaactt ctctgatggg ttttaacctc cagagtcaga atgttctttg 360 ccttactagg aaagccatct gtcatttgaa aactctgtac attttatcag cagcttatcc 420 atccattgca aatatgtttt tgtgccagcc acaatatatt gcttctattt ggaccaatag 480 ggggatttga aggaattc 498 80 544 DNA Homo sapiens 80 gaattctcat aattgtccta tcgtcaagtc tttatttctg cattttactg cttgatacac 60 tgtcaggaca gactttaaaa ttattctcag tgcgatgaaa caattctgac attcatgtta 120 tgagcagtta cctcataaat agattacatg tgagattgaa cttgggcaga ctataatata 180 gcattaatga cgaaacagac acagtcatct tcgggaagaa gaatagaggc ttatttgctg 240 cctgtgaaat taaaattact ctgactggga atccatcgtt cagtaagttt actgagtgtg 300 acaccttggc ttgactgttg gaaagacaga aagggcatgt agtttataaa atcagccaag 360 gggaaaatgc ttgtcaaaat gtattgtcgg gtattttgat taatagttta tgtggcttca 420 ttaattcaga gttactctcc aatatgttta tctgcccttt cttgtctgat aatggtgaaa 480 acttgtgtga tgcattgtat atttgattta ggggtgaact ggatgtcttt gttttcactt 
540 ttag 544 81 111 DNA Homo sapiens 81 gtaagtcacc tctgagtgag ggagctgcac agtggataag gcatttggtg cccagtgtca 60 gaaggagggc agggactctc agtagacact tatctttttg tgtctcaaca g 111 82 363 DNA Homo sapiens 82 gtgagtcatg cagagagaac actcctgctg ggatgagcat ctctgggagc cagaggacag 60 tgtttaattg tgatcttatt ccacttgtca gtggtattga cactgctgac tgccttgtcc 120 tgtcttcaga gtctgtcttc cctgagaagg caaagcacct ttctttcttg ctgtgcctta 180 cattttgctg gtcaagcctt tcagtttctt ttgacagttt tttttacttc tttctttttt 240 caatgttgct cttaccaaga gtagctcctc tgccttccac tttacacatg agagctgggc 300 gacgccattc agtcctaagg cttttaccat cacctctctt ggtgttttta ttgtcatctc 360 taa 363 83 434 DNA Homo sapiens 83 tttaattgat tcactaggat atatgctact gaaaggggaa tctgcttaaa gtgctttctg 60 atatttatta ttactaaaac ttagaattta ttaaaaatac tgactgtgaa aaattacttg 120 ggtcgtttgc ctttttaaaa ggatttttgg catgtctcat taaaaaaaga aatactagat 180 atcttcagtg aagttacaaa tcgaatacac attggctctg aaattctgat tgatactggg 240 tcataaaaag ttttcccaaa tcagacttgg aaagtgatca ctctcttgtt actctttttt 300 ccttgtcatg ggtgatagcc atttgtgttt attggaagat cggtgaattt taaggaacat 360 aggcccaaat ttgaggaagg gccatggttt ttgatccctc cattctgacc ggatctctgc 420 attgtgtcta ctag 434 84 264 DNA Homo sapiens 84 gtgagctttt tcttagaacc cgtggagcac ctggttgagg gtcacagagg aggcgcacag 60 ggaaacactc accaatgggg gttgcattga actgaactca aaatatgtga taaaactgat 120 tttcctgatg tgggcatccc gcagccccct ccctgcccat cctggagact gtggcaagta 180 ggttttataa tactacgtta gagactgaat ctttgtcctg aaaaatagtt tgaaaggttc 240 atttttcttg ttttttcccc caag 264 85 175 DNA Homo sapiens 85 gtgagagagt acaggttaca atagctcatc ttcagttttt ttcagcttta tgtgctgtaa 60 cccagcagtt tgctgacttg cttaataaaa gggcatgtgt tcccaaaatg tacatctata 120 ccaaggttct gtcaatttta ttttaaaaac accatggaga cttcttaaag aattc 175 86 588 DNA Homo sapiens 86 gaattcccat tctcgaatac attggtttta tatgcttaca tttatgtgtt agttattaaa 60 acatactaat attgtatatc tagtcaaaac tgaggtagag agaataaatg gttgattttg 120 agtttgagtt tcatagtcca aaaagctgat atattgcctg tgttcaagag ggtctatatc 180 agccctctag atgccagcat ctccaaattt 
tacttttttg gaatctgtac agtatttgca 240 atatttttat tacaaatttc tactctgtgg aatttaattt ttaaaatacc tgcaatacat 300 atatatgttg aatagatgaa aaattatgta gataataatg aatgatacgg ttctaaaaag 360 acaggttaaa aagtaagttc acttttattt tgagcttcag aatcattcag aagccagtcg 420 ccacaaacgc agaccaaggc tcttggcaca tcaaatatgc ctatggctta gggttattga 480 caagtcttat gttgcagtgt atgtggttta tagtcctgcc ttccacagtt gcttgggaga 540 gctgtgagtc actgaggctt atgaatgttt acattttgtt tgttgcag 588 87 78 DNA Homo sapiens 87 gtaaagacac tttgtctata ttgcgtttgt ccctattagt tcagactatc tctacccaat 60 caagcaacga tgctcgtt 78 88 376 DNA Homo sapiens 88 acatgtgcca gtactggtga gagcgcaagc tttggagtca aacacaaatg ggtttgcatc 60 ctggccctac caattatgag ctctgagcca tgggcaagtg actaactccc tgggcctcag 120 tttctctgta acatctgtca gacttcatgg gtccaggtga ggattaaagg agatcatgta 180 tttacagcac atggcatggt gcttcacata aaataagtat ttagtaaatg ataactggtt 240 ccttctctca gaaacttatt tctgggcctg ccaggggccg ccctttttca tggcacaagt 300 tgggttccca gggttcagta ttcttttaaa tagttttctg gagatcctcc atttgggtat 360 tttttcctgc tttcag 376 89 111 DNA Homo sapiens 89 gtaagctttg agtgtcaaaa cagatttact tctcagggtg tggattcctg ccccgacact 60 cccgcccata ggtccaagag cagtttgtat cttgaattgg tgcttgaatt c 111 90 2264 DNA Homo sapiens 90 ctgtatagta atcattgact aaagccattt gtctgtgttt tcttcttgtg gttgtatata 60 tcaggtaaaa tattttccaa agagccatgt gtcatgtaat actgaaccac tttgatattg 120 agacattaat ttgtaccctg tgttattatc tactagtaat aatgtaatac tgtagaaata 180 ttgctctaat tcttttcaaa attgttgcat cccccttaga atgtttctat ttccataagg 240 atttaggtat gctattatcc cttcttatac cctaagatga agctgttttt gtgctctttg 300 ttcatcattg gccctcattc caagcacttt acgctgtctg taatgggatc tatttttgca 360 ctggaatatc tgagaattgc aaaactagac aaaagtttca caacagattt tctaagttaa 420 atcattttca ttaaaaggaa aaaagaaaaa aaatttttgt atgtcaataa cctttatatg 480 aagtattaaa atgcatattt ctatgttgta atataatgag tcacaaaata aagctgtgac 540 agttctgttg gtctacagaa atttactttt gtgcatttgt ggcaccacct actgttgaag 600 ggttataaag ccattagaaa agtagagggg aagtgatttg gatcaaaagg aaaaacttta 660 gaaaagattc 
aaatgttccc ttaatcataa aagagaactg aggggactac ttgaaaataa 720 aaggttgttt tgtattttca tgttggttaa gatactgagt aactggtatt aagtgttaga 780 ggtttttaga taaatattct gcttaatgat tatgaagctg cactgagatt tctgaaaatg 840 ctctgtagct gagcttattt aataaatgtt cacttggtat aggggaagct acaaaggcag 900 ccttcagtgt ccttttgttt attcaaccaa aaatataagg acacaatgta gcagttatac 960 tgggaaggtg ctgggggtgg tggcaatggt gagcaggaag gcgaagtaga tatggaaaca 1020 gaaatgatac taatatcggt gattccttcc ttttttcctg taataagtgc tgtgcagaca 1080 acatatgagc agtgctgata aatgtaaatg tatttttcat agctcattaa gaatcagttt 1140 cagaaagaga tgtctgctta ttttgctact tgaagaatcc ctgtcaaaca gtccttttga 1200 ggaagtacaa gaggctgtct ctatttgtga cctcaggaat ggctgtgaca gtgtcgtgag 1260 cagtcctttt cctgtggcac agatctgaac tttgtgtgca gaaaaatctt ggcttcaagt 1320 gagccaagat gccccctgag catcagcatc acaacttcat cctcctatct tgaagttcat 1380 gttatagtga ctttaatgaa atcatagaac actgtttctt cgtgaacaat gacgagggag 1440 aggaaaaaac tttattgaaa aataaaaagg caggtaattt agatgaaaat atgttaccca 1500 tgaggttttg tttttgcttt ttgtttttgt ttttgagaaa cagaatctcg ctctgtcgtc 1560 caggctggag tgcagcggca tgatcttggc tcactgcaac ctccgcctcc cgggttcaag 1620 cgattctcct cagcttccca agtagctggt actacaggca tgcgccacca caaccagcta 1680 atttttgtat ttttagtaga gatggggttt cactatacgt tggccaggct ggtctcaaac 1740 tcctgaccta aggtgatcct tctgccttgg gctcccaaag tgctgggatt acaggcatga 1800 gccaccttgc ctggccctac ccatgagcct tgactaaaac attcttctat ctgtagaaaa 1860 gcccaaaaga acttttccag attcaaaaaa cttggcactt tgtaatggta atgtttacat 1920 taagtaaaaa aaaaaaaaaa aaacccactt agcttcagtt ttcaagtgtt tactgtgttg 1980 tcatgcactt catttaattc tcaacacctg ccctatgagg taaaaagtac cattttacat 2040 atgagtaaat tacagctcag tggataagaa actcgtccaa aggtacaggt tcagtcaagt 2100 ggcagagggt tctttttgtt gaagttaggt atcagttaaa attgaccttg taaaatcaca 2160 tcagcatcaa tatacattaa tttaacaaat atttattgaa ctttactgta tgccagatac 2220 ttctctaggt actagggggt acaatgtaga agaaaataga attc 2264 91 9497 DNA Homo sapiens misc_feature (6765)..(6765) n is a, c, g, or t 91 caaacatgtc agctgttact ggaagtggcc 
tggcctctat ttatcttcct gatcctgatc 60 tctgttcggc tgagctaccc accctatgaa caacatgaat gccattttcc aaataaagcc 120 atgccctctg caggaacact tccttgggtt caggggatta tctgtaatgc caacaacccc 180 tgtttccgtt acccgactcc tggggaggct cccggagttg ttggaaactt taacaaatcc 240 attgtggctc gcctgttctc agatgctcgg aggcttcttt tatacagcca gaaagacacc 300 agcatgaagg acatgcgcaa agttctgaga acattacagc agatcaagaa atccagctca 360 aacttgaagc ttcaagattt cctggtggac aatgaaacct tctctgggtt cctgtatcac 420 aacctctctc tcccaaagtc tactgtggac aagatgctga gggctgatgt cattctccac 480 aaggtatttt tgcaaggcta ccagttacat ttgacaagtc tgtgcaatgg atcaaaatca 540 gaagagatga ttcaacttgg tgaccaagaa gtttctgagc tttgtggcct accaagggag 600 aaactggctg cagcagagcg agtacttcgt tccaacatgg acatcctgaa gccaatcctg 660 agaacactaa actctacatc tcccttcccg agcaaggagc tggccgaagc cacaaaaaca 720 ttgctgcata gtcttgggac tctggcccag gagctgttca gcatgagaag ctggagtgac 780 atgcgacagg aggtgatgtt tctgaccaat gtgaacagct ccagctcctc cacccaaatc 840 taccaggctg tgtctcgtat tgtctgcggg catcccgagg gaggggggct gaagatcaag 900 tctctcaact ggtatgagga caacaactac aaagccctct ttggaggcaa tggcactgag 960 gaagatgctg aaaccttcta tgacaactct acaactcctt actgcaatga tttgatgaag 1020 aatttggagt ctagtcctct ttcccgcatt atctggaaag ctctgaagcc gctgctcgtt 1080 gggaagatcc tgtatacacc tgacactcca gccacaaggc aggtcatggc tgaggtgaac 1140 aagaccttcc aggaactggc tgtgttccat gatctggaag gcatgtggga ggaactcagc 1200 cccaagatct ggaccttcat ggagaacagc caagaaatgg accttgtccg gatgctgttg 1260 gacagcaggg acaatgacca cttttgggaa cagcagttgg atggcttaga ttggacagcc 1320 caagacatcg tggcgttttt ggccaagcac ccagaggatg tccagtccag taatggttct 1380 gtgtacacct ggagagaagc tttcaacgag actaaccagg caatccggac catatctcgc 1440 ttcatggagt gtgtcaacct gaacaagcta gaacccatag caacagaagt ctggctcatc 1500 aacaagtcca tggagctgct ggatgagagg aagttctggg ctggtattgt gttcactgga 1560 attactccag gcagcattga gctgccccat catgtcaagt acaagatccg aatggacatt 1620 gacaatgtgg agaggacaaa taaaatcaag gatgggtact gggaccctgg tcctcgagct 1680 gacccctttg aggacatgcg gtacgtctgg gggggcttcg cctacttgca 
ggatgtggtg 1740 gagcaggcaa tcatcagggt gctgacgggc accgagaaga aaactggtgt ctatatgcaa 1800 cagatgccct atccctgtta cgttgatgac atctttctgc gggtgatgag ccggtcaatg 1860 cccctcttca tgacgctggc ctggatttac tcagtggctg tgatcatcaa gggcatcgtg 1920 tatgagaagg aggcacggct gaaagagacc atgcggatca tgggcctgga caacagcatc 1980 ctctggttta gctggttcat tagtagcctc attcctcttc ttgtgagcgc tggcctgcta 2040 gtggtcatcc tgaagttagg aaacctgctg ccctacagtg atcccagcgt ggtgtttgtc 2100 ttcctgtccg tgtttgctgt ggtgacaatc ctgcagtgct tcctgattag cacactcttc 2160 tccagagcca acctggcagc agcctgtggg ggcatcatct acttcacgct gtacctgccc 2220 tacgtcctgt gtgtggcatg gcaggactac gtgggcttca cactcaagat cttcgctagc 2280 ctgctgtctc ctgtggcttt tgggtttggc tgtgagtact ttgccctttt tgaggagcag 2340 ggcattggag tgcagtggga caacctgttt gagagtcctg tggaggaaga tggcttcaat 2400 ctcaccactt cggtctccat gatgctgttt gacaccttcc tctatggggt gatgacctgg 2460 tacattgagg ctgtctttcc aggccagtac ggaattccca ggccctggta ttttccttgc 2520 accaagtcct actggtttgg cgaggaaagt gatgagaaga gccaccctgg ttccaaccag 2580 aagagaatat cagaaatctg catggaggag gaacccaccc acttgaagct gggcgtgtcc 2640 attcagaacc tggtaaaagt ctaccgagat gggatgaagg tggctgtcga tggcctggca 2700 ctgaattttt atgagggcca gatcacctcc ttcctgggcc acaatggagc ggggaagacg 2760 accaccatgt caatcctgac cgggttgttc cccccgacct cgggcaccgc ctacatcctg 2820 ggaaaagaca ttcgctctga gatgagcacc atccggcaga acctgggggt ctgtccccag 2880 cataacgtgc tgtttgacat gctgactgtc gaagaacaca tctggttcta tgcccgcttg 2940 aaagggctct ctgagaagca cgtgaaggcg gagatggagc agatggccct ggatgttggt 3000 ttgccatcaa gcaagctgaa aagcaaaaca agccagctgt caggtggaat gcagagaaag 3060 ctatctgtgg ccttggcctt tgtcggggga tctaaggttg tcattctgga tgaacccaca 3120 gctggtgtgg acccttactc ccgcagggga atatgggagc tgctgctgaa ataccgacaa 3180 ggccgcacca ttattctctc tacacaccac atggatgaag cggacgtcct gggggacagg 3240 attgccatca tctcccatgg gaagctgtgc tgtgtgggct cctccctgtt tctgaagaac 3300 cagctgggaa caggctacta cctgaccttg gtcaagaaag atgtggaatc ctccctcagt 3360 tcctgcagaa acagtagtag cactgtgtca tacctgaaaa aggaggacag tgtttctcag 
3420 agcagttctg atgctggcct gggcagcgac catgagagtg acacgctgac catcgatgtc 3480 tctgctatct ccaacctcat caggaagcat gtgtctgaag cccggctggt ggaagacata 3540 gggcatgagc tgacctatgt gctgccatat gaagctgcta aggagggagc ctttgtggaa 3600 ctctttcatg agattgatga ccggctctca gacctgggca tttctagtta tggcatctca 3660 gagacgaccc tggaagaaat attcctcaag gtggccgaag agagtggggt ggatgctgag 3720 acctcagatg gtaccttgcc agcaagacga aacaggcggg ccttcgggga caagcagagc 3780 tgtcttcgcc cgttcactga agatgatgct gctgatccaa atgattctga catagaccca 3840 gaatccagag agacagactt gctcagtggg atggatggca aagggtccta ccaggtgaaa 3900 ggctggaaac ttacacagca acagtttgtg gcccttttgt ggaagagact gctaattgcc 3960 agacggagtc ggaaaggatt ttttgctcag attgtcttgc cagctgtgtt tgtctgcatt 4020 gcccttgtgt tcagcctgat cgtgccaccc tttggcaagt accccagcct ggaacttcag 4080 ccctggatgt acaacgaaca gtacacattt gtcagcaatg atgctcctga ggacacggga 4140 accctggaac tcttaaacgc cctcaccaaa gaccctggct tcgggacccg ctgtatggaa 4200 ggaaacccaa tcccagacac gccctgccag gcaggggagg aagagtggac cactgcccca 4260 gttccccaga ccatcatgga cctcttccag aatgggaact ggacaatgca gaacccttca 4320 cctgcatgcc agtgtagcag cgacaaaatc aagaagatgc tgcctgtgtg tcccccaggg 4380 gcaggggggc tgcctcctcc acaaagaaaa caaaacactg cagatatcct tcaggacctg 4440 acaggaagaa acatttcgga ttatctggtg aagacgtatg tgcagatcat agccaaaagc 4500 ttaaagaaca agatctgggt gaatgagttt aggtatggcg gcttttccct gggtgtcagt 4560 aatactcaag cacttcctcc gagtcaagaa gttaatgatg ccaccaaaca aatgaagaaa 4620 cacctaaagc tggccaagga cagttctgca gatcgatttc tcaacagctt gggaagattt 4680 atgacaggac tggacaccag aaataatgtc aaggtgtggt tcaataacaa gggctggcat 4740 gcaatcagct ctttcctgaa tgtcatcaac aatgccattc tccgggccaa cctgcaaaag 4800 ggagagaacc ctagccatta tggaattact gctttcaatc atcccctgaa tctcaccaag 4860 cagcagctct cagaggtggc tccgatgacc acatcagtgg atgtccttgt gtccatctgt 4920 gtcatctttg caatgtcctt cgtcccagcc agctttgtcg tattcctgat ccaggagcgg 4980 gtcagcaaag caaaacacct gcagttcatc agtggagtga agcctgtcat ctactggctc 5040 tctaattttg tctgggatat gtgcaattac gttgtccctg ccacactggt cattatcatc 5100 
ttcatctgct tccagcagaa gtcctatgtg tcctccacca atctgcctgt gctagccctt 5160 ctacttttgc tgtatgggtg gtcaatcaca cctctcatgt acccagcctc ctttgtgttc 5220 aagatcccca gcacagccta tgtggtgctc accagcgtga acctcttcat tggcattaat 5280 ggcagcgtgg ccacctttgt gctggagctg ttcaccgaca ataagctgaa taatatcaat 5340 gatatcctga agtccgtgtt cttgatcttc ccacattttt gcctgggacg agggctcatc 5400 gacatggtga aaaaccaggc aatggctgat gccctggaaa ggtttgggga gaatcgcttt 5460 gtgtcaccat tatcttggga cttggtggga cgaaacctct tcgccatggc cgtggaaggg 5520 gtggtgttct tcctcattac tgttctgatc cagtacagat tcttcatcag gcccagacct 5580 gtaaatgcaa agctatctcc tctgaatgat gaagatgaag atgtgaggcg ggaaagacag 5640 agaattcttg atggtggagg ccagaatgac atcttagaaa tcaaggagtt gacgaagata 5700 tatagaagga agcggaagcc tgctgttgac aggatttgcg tgggcattcc tcctggtgag 5760 tgctttgggc tcctgggagt taatggggct ggaaaatcat caactttcaa gatgttaaca 5820 ggagatacca ctgttaccag aggagatgct ttccttaaca gaaatagtat cttatcaaac 5880 atccatgaag tacatcagaa catgggctac tgccctcagt ttgatgccat cacagagctg 5940 ttgactggga gagaacacgt ggagttcttt gcccttttga gaggagtccc agagaaagaa 6000 gttggcaagg ttggtgagtg ggcgattcgg aaactgggcc tcgtgaagta tggagaaaaa 6060 tatgctggta actatagtgg aggcaacaaa cgcaagctct ctacagccat ggctttgatc 6120 ggcgggcctc ctgtggtgtt tctggatgaa cccaccacag gcatggatcc caaagcccgg 6180 cggttcttgt ggaattgtgc cctaagtgtt gtcaaggagg ggagatcagt agtgcttaca 6240 tctcatagta tggaagaatg tgaagctctt tgcactagga tggcaatcat ggtcaatgga 6300 aggttcaggt gccttggcag tgtccagcat ctaaaaaata ggtttggaga tggttataca 6360 atagttgtac gaatagcagg gtccaacccg gacctgaagc ctgtccagga tttctttgga 6420 cttgcatttc ctggaagtgt tccaaaagag aaacaccgga acatgctaca ataccagctt 6480 ccatcttcat tatcttctct ggccaggata ttcagcatcc tctcccagag caaaaagcga 6540 ctccacatag aagactactc tgtttctcag acaacacttg accaagtatt tgtgaacttt 6600 gccaaggacc aaagtgatga tgaccactta aaagacctct cattacacaa aaaccagaca 6660 gtagtggacg ttgcagttct cacatctttt ctacaggatg agaaagtgaa agaaagctat 6720 gtatgaagaa tcctgttcat acggggtggc tgaaagtaaa gaggnactag actttccttt 6780 gcaccatgtg 
aagtgttgtg gagaaaagag ccagaagttg atgtgggaag aagtaaactg 6840 gatactgtac tgatactatt caatgcaatg caattcaatg caatgaaaac aaaattccat 6900 tacaggggca gtgcctttgt agcctatgtc ttgtatggct ctcaagtgaa agacttgaat 6960 ttagtttttt acctatacct atgtgaaact ctattatgga acccaatgga catatgggtt 7020 tgaactcaca cttttttttt ttttttgttc ctgtgtattc tcattggggt tgcaacaata 7080 attcatcaag taatcatggc cagcgattat tgatcaaaat caaaaggtaa tgcacatcct 7140 cattcactaa gccatgccat gcccaggaga ctggtttccc ggtgacacat ccattgctgg 7200 caatgagtgt gccagagtta ttagtgccaa gtttttcaga aagtttgaag caccatggtg 7260 tgtcatgctc acttttgtga aagctgctct gctcagagtc tatcaacatt gaatatcagt 7320 tgacagaatg gtgccatgcg tggctaacat cctgctttga ttccctctga taagctgttc 7380 tggtggcagt aacatgcaac aaaaatgtgg gtgtctctag gcacgggaaa cttggttcca 7440 ttgttatatt gtcctatgct tcgagccatg ggtctacagg gtcatcctta tgagactctt 7500 aaatatactt agatcctggt aagaggcaaa gaatcaacag ccaaactgct ggggctgcaa 7560 gctgctgaag ccagggcatg ggattaaaga gattgtgcgt tcaaacctag ggaagcctgt 7620 gcccatttgt cctgactgtc tgctaacatg gtacactgca tctcaagatg tttatctgac 7680 acaagtgtat tatttctggc tttttgaatt aatctagaaa atgaaaagat ggagttgtat 7740 tttgacaaaa atgtttgtac tttttaatgt tatttggaat tttaagttct atcagtgact 7800 tctgaatcct tagaatggcc tctttgtaga accctgtggt atagaggagt atggccactg 7860 ccccactatt tttattttct tatgtaagtt tgcatatcag tcatgactag tgcctagaaa 7920 gcaatgtgat ggtcaggatc tcatgacatt atatttgagt ttctttcaga tcatttagga 7980 tactcttaat ctcacttcat caatcaaata ttttttgagt gtatgctgta gctgaaagag 8040 tatgtacgta cgtataagac tagagagata ttaagtctca gtacacttcc tgtgccatgt 8100 tattcagctc actggtttac aaatataggt tgtcttgtgg ttgtaggagc ccactgtaac 8160 aatactgggc agcctttttt ttttttttta attgcaacaa tgcaaaagcc aagaaagtat 8220 aagggtcaca agtctaaaca atgaattctt caacagggaa aacagctagc ttgaaaactt 8280 gctgaaaaac acaacttgtg tttatggcat ttagtacctt caaataattg gctttgcaga 8340 tattggatac cccattaaat ctgacagtct caaatttttc atctcttcaa tcactagtca 8400 agaaaaatat aaaaacaaca aatacttcca tatggagcat ttttcagagt tttctaaccc 8460 agtcttattt ttctagtcag 
taaacatttg taaaaatact gtttcactaa tacttactgt 8520 taactgtctt gagagaaaag aaaaatatga gagaactatt gtttggggaa gttcaagtga 8580 tctttcaata tcattactaa cttcttccac tttttccaaa atttgaatat taacgctaaa 8640 ggtgtaagac ttcagatttc aaattaatct ttctatattt tttaaattta cagaatatta 8700 tataacccac tgctgaaaaa gaaaaaaatg attgttttag aagttaaagt caatattgat 8760 tttaaatata agtaatgaag gcatatttcc aataactagt gatatggcat cgttgcattt 8820 tacagtatct tcaaaaatac agaatttata gaataatttc tcctcattta atatttttca 8880 aaatcaaagt tatggtttcc tcattttact aaaatcgtat tctaattctt cattatagta 8940 aatctatgag caactcctta cttcggttcc tctgatttca aggccatatt ttaaaaaatc 9000 aaaaggcact gtgaactatt ttgaagaaaa cacaacattt taatacagat tgaaaggacc 9060 tcttctgaag ctagaaacaa tctatagtta tacatcttca ttaatactgt gttacctttt 9120 aaaatagtaa ttttttacat tttcctgtgt aaacctaatt gtggtagaaa tttttaccaa 9180 ctctatactc aatcaagcaa aatttctgta tattccctgt ggaatgtacc tatgtgagtt 9240 tcagaaattc tcaaaatacg tgttcaaaaa tttctgcttt tgcatctttg ggacacctca 9300 gaaaacttat taacaactgt gaatatgaga aatacagaag aaaataataa gccctctata 9360 cataaatgcc cagcacaatt cattgttaaa aaacaaccaa acctcacact actgtatttc 9420 attatctgta ctgaaagcaa atgctttgtg actattaaat gttgcacatc attcattcaa 9480 aaaaaaaaaa aaaaaaa 9497 92 2618 DNA Homo sapiens 92 gcaatgaaaa caaaattcca ttacaggggc agtgcctttg tagcctatgt cttgtatggc 60 tctcaagtga aagacttgaa tttagttttt tacctatacc tatgtgaaac tctattatgg 120 aacccaatgg acatatgggt ttgaactcac actttttttt tttttttgtt cctgtgtatt 180 ctcattgggg ttgcaacaat aattcatcaa gtaatcatgg ccagcgatta ttgatcaaaa 240 tcaaaaggta atgcacatcc tcattcacta agccatgcca tgcccaggag actggtttcc 300 cggtgacaca tccattgctg gcaatgagtg tgccagagtt attagtgcca agtttttcag 360 aaagtttgaa gcaccatggt gtgtcatgct cacttttgtg aaagctgctc tgctcagagt 420 ctatcaacat tgaatatcag ttgacagaat ggtgccatgc gtggctaaca tcctgctttg 480 attccctctg ataagctgtt ctggtggcag taacatgcaa caaaaatgtg ggtgtctcta 540 ggcacgggaa acttggttcc attgttatat tgtcctatgc ttcgagccat gggtctacag 600 ggtcatcctt atgagactct taaatatact tagatcctgg taagaggcaa 
agaatcaaca 660 gccaaactgc tggggctgca agctgctgaa gccagggcat gggattaaag agattgtgcg 720 ttcaaaccta gggaagcctg tgcccatttg tcctgactgt ctgctaacat ggtacactgc 780 atctcaagat gtttatctga cacaagtgta ttatttctgg ctttttgaat taatctagaa 840 aatgaaaaga tggagttgta ttttgacaaa aatgtttgta ctttttaatg ttatttggaa 900 ttttaagttc tatcagtgac ttctgaatcc ttagaatggc ctctttgtag aaccctgtgg 960 tatagaggag tatggccact gccccactat ttttattttc ttatgtaagt ttgcatatca 1020 gtcatgacta gtgcctagaa agcaatgtga tggtcaggat ctcatgacat tatatttgag 1080 tttctttcag atcatttagg atactcttaa tctcacttca tcaatcaaat attttttgag 1140 tgtatgctgt agctgaaaga gtatgtacgt acgtataaga ctagagagat attaagtctc 1200 agtacacttc ctgtgccatg ttattcagct cactggttta caaatatagg ttgtcttgtg 1260 gttgtaggag cccactgtaa caatactggg cagccttttt tttttttttt aattgcaaca 1320 atgcaaaagc caagaaagta taagggtcac aagtctaaac aatgaattct tcaacaggga 1380 aaacagctag cttgaaaact tgctgaaaaa cacaacttgt gtttatggca tttagtacct 1440 tcaaataatt ggctttgcag atattggata ccccattaaa tctgacagtc tcaaattttt 1500 catctcttca atcactagtc aagaaaaata taaaaacaac aaatacttcc atatggagca 1560 tttttcagag ttttctaacc cagtcttatt tttctagtca gtaaacattt gtaaaaatac 1620 tgtttcacta atacttactg ttaactgtct tgagagaaaa gaaaaatatg agagaactat 1680 tgtttgggga agttcaagtg atctttcaat atcattacta acttcttcca ctttttccaa 1740 aatttgaata ttaacgctaa aggtgtaaga cttcagattt caaattaatc tttctatatt 1800 ttttaaattt acagaatatt atataaccca ctgctgaaaa agaaaaaaat gattgtttta 1860 gaagttaaag tcaatattga ttttaaatat aagtaatgaa ggcatatttc caataactag 1920 tgatatggca tcgttgcatt ttacagtatc ttcaaaaata cagaatttat agaataattt 1980 ctcctcattt aatatttttc aaaatcaaag ttatggtttc ctcattttac taaaatcgta 2040 ttctaattct tcattatagt aaatctatga gcaactcctt acttcggttc ctctgatttc 2100 aaggccatat tttaaaaaat caaaaggcac tgtgaactat tttgaagaaa acacaacatt 2160 ttaatacaga ttgaaaggac ctcttctgaa gctagaaaca atctatagtt atacatcttc 2220 attaatactg tgttaccttt taaaatagta attttttaca ttttcctgtg taaacctaat 2280 tgtggtagaa atttttacca actctatact caatcaagca aaatttctgt atattccctg 2340 
tggaatgtac ctatgtgagt ttcagaaatt ctcaaaatac gtgttcaaaa atttctgctt 2400 ttgcatcttt gggacacctc agaaaactta ttaacaactg tgaatatgag aaatacagaa 2460 gaaaataata agccctctat acataaatgc ccagcacaat tcattgttaa aaaacaacca 2520 aacctcacac tactgtattt cattatctgt actgaaagca aatgctttgt gactattaaa 2580 tgttgcacat cattcattca aaaaaaaaaa aaaaaaaa 2618 93 302 DNA Homo sapiens 93 tgtgtcaacc tgaacaagct agaacccata gcaacagaag tctggctcat caacaagtcc 60 atggagctgc tggagtacag tggcgtgacc tcagctcact gcaacctctg cctcctgagt 120 tcaagtgatt ctcgtgcctc agcctcccaa gtagctggga ttacagctcc tgccaccacg 180 cccggggctg gtattgtgtt cactggaatt actccaggca gcattgagct gccccatcat 240 gtcaagtaca agatccgaat ggacattgac aatgtggaga ggacaaataa aatcaaggat 300 gg 302 94 9593 DNA Homo sapiens misc_feature (6861)..(6861) n is a, c, g, or t 94 caaacatgtc agctgttact ggaagtggcc tggcctctat ttatcttcct gatcctgatc 60 tctgttcggc tgagctaccc accctatgaa caacatgaat gccattttcc aaataaagcc 120 atgccctctg caggaacact tccttgggtt caggggatta tctgtaatgc caacaacccc 180 tgtttccgtt acccgactcc tggggaggct cccggagttg ttggaaactt taacaaatcc 240 attgtggctc gcctgttctc agatgctcgg aggcttcttt tatacagcca gaaagacacc 300 agcatgaagg acatgcgcaa agttctgaga acattacagc agatcaagaa atccagctca 360 aacttgaagc ttcaagattt cctggtggac aatgaaacct tctctgggtt cctgtatcac 420 aacctctctc tcccaaagtc tactgtggac aagatgctga gggctgatgt cattctccac 480 aaggtatttt tgcaaggcta ccagttacat ttgacaagtc tgtgcaatgg atcaaaatca 540 gaagagatga ttcaacttgg tgaccaagaa gtttctgagc tttgtggcct accaagggag 600 aaactggctg cagcagagcg agtacttcgt tccaacatgg acatcctgaa gccaatcctg 660 agaacactaa actctacatc tcccttcccg agcaaggagc tggccgaagc cacaaaaaca 720 ttgctgcata gtcttgggac tctggcccag gagctgttca gcatgagaag ctggagtgac 780 atgcgacagg aggtgatgtt tctgaccaat gtgaacagct ccagctcctc cacccaaatc 840 taccaggctg tgtctcgtat tgtctgcggg catcccgagg gaggggggct gaagatcaag 900 tctctcaact ggtatgagga caacaactac aaagccctct ttggaggcaa tggcactgag 960 gaagatgctg aaaccttcta tgacaactct acaactcctt actgcaatga tttgatgaag 1020 aatttggagt 
ctagtcctct ttcccgcatt atctggaaag ctctgaagcc gctgctcgtt 1080 gggaagatcc tgtatacacc tgacactcca gccacaaggc aggtcatggc tgaggtgaac 1140 aagaccttcc aggaactggc tgtgttccat gatctggaag gcatgtggga ggaactcagc 1200 cccaagatct ggaccttcat ggagaacagc caagaaatgg accttgtccg gatgctgttg 1260 gacagcaggg acaatgacca cttttgggaa cagcagttgg atggcttaga ttggacagcc 1320 caagacatcg tggcgttttt ggccaagcac ccagaggatg tccagtccag taatggttct 1380 gtgtacacct ggagagaagc tttcaacgag actaaccagg caatccggac catatctcgc 1440 ttcatggagt gtgtcaacct gaacaagcta gaacccatag caacagaagt ctggctcatc 1500 aacaagtcca tggagctgct ggagtacagt ggcgtgacct cagctcactg caacctctgc 1560 ctcctgagtt caagtgattc tcgtgcctca gcctcccaag tagctgggat tacagctcct 1620 gccaccacgc ccggggctgg tattgtgttc actggaatta ctccaggcag cattgagctg 1680 ccccatcatg tcaagtacaa gatccgaatg gacattgaca atgtggagag gacaaataaa 1740 atcaaggatg ggtactggga ccctggtcct cgagctgacc cctttgagga catgcggtac 1800 gtctgggggg gcttcgccta cttgcaggat gtggtggagc aggcaatcat cagggtgctg 1860 acgggcaccg agaagaaaac tggtgtctat atgcaacaga tgccctatcc ctgttacgtt 1920 gatgacatct ttctgcgggt gatgagccgg tcaatgcccc tcttcatgac gctggcctgg 1980 atttactcag tggctgtgat catcaagggc atcgtgtatg agaaggaggc acggctgaaa 2040 gagaccatgc ggatcatggg cctggacaac agcatcctct ggtttagctg gttcattagt 2100 agcctcattc ctcttcttgt gagcgctggc ctgctagtgg tcatcctgaa gttaggaaac 2160 ctgctgccct acagtgatcc cagcgtggtg tttgtcttcc tgtccgtgtt tgctgtggtg 2220 acaatcctgc agtgcttcct gattagcaca ctcttctcca gagccaacct ggcagcagcc 2280 tgtgggggca tcatctactt cacgctgtac ctgccctacg tcctgtgtgt ggcatggcag 2340 gactacgtgg gcttcacact caagatcttc gctagcctgc tgtctcctgt ggcttttggg 2400 tttggctgtg agtactttgc cctttttgag gagcagggca ttggagtgca gtgggacaac 2460 ctgtttgaga gtcctgtgga ggaagatggc ttcaatctca ccacttcggt ctccatgatg 2520 ctgtttgaca ccttcctcta tggggtgatg acctggtaca ttgaggctgt ctttccaggc 2580 cagtacggaa ttcccaggcc ctggtatttt ccttgcacca agtcctactg gtttggcgag 2640 gaaagtgatg agaagagcca ccctggttcc aaccagaaga gaatatcaga aatctgcatg 2700 gaggaggaac ccacccactt 
gaagctgggc gtgtccattc agaacctggt aaaagtctac 2760 cgagatggga tgaaggtggc tgtcgatggc ctggcactga atttttatga gggccagatc 2820 acctccttcc tgggccacaa tggagcgggg aagacgacca ccatgtcaat cctgaccggg 2880 ttgttccccc cgacctcggg caccgcctac atcctgggaa aagacattcg ctctgagatg 2940 agcaccatcc ggcagaacct gggggtctgt ccccagcata acgtgctgtt tgacatgctg 3000 actgtcgaag aacacatctg gttctatgcc cgcttgaaag ggctctctga gaagcacgtg 3060 aaggcggaga tggagcagat ggccctggat gttggtttgc catcaagcaa gctgaaaagc 3120 aaaacaagcc agctgtcagg tggaatgcag agaaagctat ctgtggcctt ggcctttgtc 3180 gggggatcta aggttgtcat tctggatgaa cccacagctg gtgtggaccc ttactcccgc 3240 aggggaatat gggagctgct gctgaaatac cgacaaggcc gcaccattat tctctctaca 3300 caccacatgg atgaagcgga cgtcctgggg gacaggattg ccatcatctc ccatgggaag 3360 ctgtgctgtg tgggctcctc cctgtttctg aagaaccagc tgggaacagg ctactacctg 3420 accttggtca agaaagatgt ggaatcctcc ctcagttcct gcagaaacag tagtagcact 3480 gtgtcatacc tgaaaaagga ggacagtgtt tctcagagca gttctgatgc tggcctgggc 3540 agcgaccatg agagtgacac gctgaccatc gatgtctctg ctatctccaa cctcatcagg 3600 aagcatgtgt ctgaagcccg gctggtggaa gacatagggc atgagctgac ctatgtgctg 3660 ccatatgaag ctgctaagga gggagccttt gtggaactct ttcatgagat tgatgaccgg 3720 ctctcagacc tgggcatttc tagttatggc atctcagaga cgaccctgga agaaatattc 3780 ctcaaggtgg ccgaagagag tggggtggat gctgagacct cagatggtac cttgccagca 3840 agacgaaaca ggcgggcctt cggggacaag cagagctgtc ttcgcccgtt cactgaagat 3900 gatgctgctg atccaaatga ttctgacata gacccagaat ccagagagac agacttgctc 3960 agtgggatgg atggcaaagg gtcctaccag gtgaaaggct ggaaacttac acagcaacag 4020 tttgtggccc ttttgtggaa gagactgcta attgccagac ggagtcggaa aggatttttt 4080 gctcagattg tcttgccagc tgtgtttgtc tgcattgccc ttgtgttcag cctgatcgtg 4140 ccaccctttg gcaagtaccc cagcctggaa cttcagccct ggatgtacaa cgaacagtac 4200 acatttgtca gcaatgatgc tcctgaggac acgggaaccc tggaactctt aaacgccctc 4260 accaaagacc ctggcttcgg gacccgctgt atggaaggaa acccaatccc agacacgccc 4320 tgccaggcag gggaggaaga gtggaccact gccccagttc cccagaccat catggacctc 4380 ttccagaatg ggaactggac aatgcagaac 
ccttcacctg catgccagtg tagcagcgac 4440 aaaatcaaga agatgctgcc tgtgtgtccc ccaggggcag gggggctgcc tcctccacaa 4500 agaaaacaaa acactgcaga tatccttcag gacctgacag gaagaaacat ttcggattat 4560 ctggtgaaga cgtatgtgca gatcatagcc aaaagcttaa agaacaagat ctgggtgaat 4620 gagtttaggt atggcggctt ttccctgggt gtcagtaata ctcaagcact tcctccgagt 4680 caagaagtta atgatgccac caaacaaatg aagaaacacc taaagctggc caaggacagt 4740 tctgcagatc gatttctcaa cagcttggga agatttatga caggactgga caccagaaat 4800 aatgtcaagg tgtggttcaa taacaagggc tggcatgcaa tcagctcttt cctgaatgtc 4860 atcaacaatg ccattctccg ggccaacctg caaaagggag agaaccctag ccattatgga 4920 attactgctt tcaatcatcc cctgaatctc accaagcagc agctctcaga ggtggctccg 4980 atgaccacat cagtggatgt ccttgtgtcc atctgtgtca tctttgcaat gtccttcgtc 5040 ccagccagct ttgtcgtatt cctgatccag gagcgggtca gcaaagcaaa acacctgcag 5100 ttcatcagtg gagtgaagcc tgtcatctac tggctctcta attttgtctg ggatatgtgc 5160 aattacgttg tccctgccac actggtcatt atcatcttca tctgcttcca gcagaagtcc 5220 tatgtgtcct ccaccaatct gcctgtgcta gcccttctac ttttgctgta tgggtggtca 5280 atcacacctc tcatgtaccc agcctccttt gtgttcaaga tccccagcac agcctatgtg 5340 gtgctcacca gcgtgaacct cttcattggc attaatggca gcgtggccac ctttgtgctg 5400 gagctgttca ccgacaataa gctgaataat atcaatgata tcctgaagtc cgtgttcttg 5460 atcttcccac atttttgcct gggacgaggg ctcatcgaca tggtgaaaaa ccaggcaatg 5520 gctgatgccc tggaaaggtt tggggagaat cgctttgtgt caccattatc ttgggacttg 5580 gtgggacgaa acctcttcgc catggccgtg gaaggggtgg tgttcttcct cattactgtt 5640 ctgatccagt acagattctt catcaggccc agacctgtaa atgcaaagct atctcctctg 5700 aatgatgaag atgaagatgt gaggcgggaa agacagagaa ttcttgatgg tggaggccag 5760 aatgacatct tagaaatcaa ggagttgacg aagatatata gaaggaagcg gaagcctgct 5820 gttgacagga tttgcgtggg cattcctcct ggtgagtgct ttgggctcct gggagttaat 5880 ggggctggaa aatcatcaac tttcaagatg ttaacaggag ataccactgt taccagagga 5940 gatgctttcc ttaacagaaa tagtatctta tcaaacatcc atgaagtaca tcagaacatg 6000 ggctactgcc ctcagtttga tgccatcaca gagctgttga ctgggagaga acacgtggag 6060 ttctttgccc ttttgagagg agtcccagag aaagaagttg 
gcaaggttgg tgagtgggcg 6120 attcggaaac tgggcctcgt gaagtatgga gaaaaatatg ctggtaacta tagtggaggc 6180 aacaaacgca agctctctac agccatggct ttgatcggcg ggcctcctgt ggtgtttctg 6240 gatgaaccca ccacaggcat ggatcccaaa gcccggcggt tcttgtggaa ttgtgcccta 6300 agtgttgtca aggaggggag atcagtagtg cttacatctc atagtatgga agaatgtgaa 6360 gctctttgca ctaggatggc aatcatggtc aatggaaggt tcaggtgcct tggcagtgtc 6420 cagcatctaa aaaataggtt tggagatggt tatacaatag ttgtacgaat agcagggtcc 6480 aacccggacc tgaagcctgt ccaggatttc tttggacttg catttcctgg aagtgttcca 6540 aaagagaaac accggaacat gctacaatac cagcttccat cttcattatc ttctctggcc 6600 aggatattca gcatcctctc ccagagcaaa aagcgactcc acatagaaga ctactctgtt 6660 tctcagacaa cacttgacca agtatttgtg aactttgcca aggaccaaag tgatgatgac 6720 cacttaaaag acctctcatt acacaaaaac cagacagtag tggacgttgc agttctcaca 6780 tcttttctac aggatgagaa agtgaaagaa agctatgtat gaagaatcct gttcatacgg 6840 ggtggctgaa agtaaagagg nactagactt tcctttgcac catgtgaagt gttgtggaga 6900 aaagagccag aagttgatgt gggaagaagt aaactggata ctgtactgat actattcaat 6960 gcaatgcaat tcaatgcaat gaaaacaaaa ttccattaca ggggcagtgc ctttgtagcc 7020 tatgtcttgt atggctctca agtgaaagac ttgaatttag ttttttacct atacctatgt 7080 gaaactctat tatggaaccc aatggacata tgggtttgaa ctcacacttt tttttttttt 7140 ttgttcctgt gtattctcat tggggttgca acaataattc atcaagtaat catggccagc 7200 gattattgat caaaatcaaa aggtaatgca catcctcatt cactaagcca tgccatgccc 7260 aggagactgg tttcccggtg acacatccat tgctggcaat gagtgtgcca gagttattag 7320 tgccaagttt ttcagaaagt ttgaagcacc atggtgtgtc atgctcactt ttgtgaaagc 7380 tgctctgctc agagtctatc aacattgaat atcagttgac agaatggtgc catgcgtggc 7440 taacatcctg ctttgattcc ctctgataag ctgttctggt ggcagtaaca tgcaacaaaa 7500 atgtgggtgt ctctaggcac gggaaacttg gttccattgt tatattgtcc tatgcttcga 7560 gccatgggtc tacagggtca tccttatgag actcttaaat atacttagat cctggtaaga 7620 ggcaaagaat caacagccaa actgctgggg ctgcaagctg ctgaagccag ggcatgggat 7680 taaagagatt gtgcgttcaa acctagggaa gcctgtgccc atttgtcctg actgtctgct 7740 aacatggtac actgcatctc aagatgttta tctgacacaa gtgtattatt 
tctggctttt 7800 tgaattaatc tagaaaatga aaagatggag ttgtattttg acaaaaatgt ttgtactttt 7860 taatgttatt tggaatttta agttctatca gtgacttctg aatccttaga atggcctctt 7920 tgtagaaccc tgtggtatag aggagtatgg ccactgcccc actattttta ttttcttatg 7980 taagtttgca tatcagtcat gactagtgcc tagaaagcaa tgtgatggtc aggatctcat 8040 gacattatat ttgagtttct ttcagatcat ttaggatact cttaatctca cttcatcaat 8100 caaatatttt ttgagtgtat gctgtagctg aaagagtatg tacgtacgta taagactaga 8160 gagatattaa gtctcagtac acttcctgtg ccatgttatt cagctcactg gtttacaaat 8220 ataggttgtc ttgtggttgt aggagcccac tgtaacaata ctgggcagcc tttttttttt 8280 tttttaattg caacaatgca aaagccaaga aagtataagg gtcacaagtc taaacaatga 8340 attcttcaac agggaaaaca gctagcttga aaacttgctg aaaaacacaa cttgtgttta 8400 tggcatttag taccttcaaa taattggctt tgcagatatt ggatacccca ttaaatctga 8460 cagtctcaaa tttttcatct cttcaatcac tagtcaagaa aaatataaaa acaacaaata 8520 cttccatatg gagcattttt cagagttttc taacccagtc ttatttttct agtcagtaaa 8580 catttgtaaa aatactgttt cactaatact tactgttaac tgtcttgaga gaaaagaaaa 8640 atatgagaga actattgttt ggggaagttc aagtgatctt tcaatatcat tactaacttc 8700 ttccactttt tccaaaattt gaatattaac gctaaaggtg taagacttca gatttcaaat 8760 taatctttct atatttttta aatttacaga atattatata acccactgct gaaaaagaaa 8820 aaaatgattg ttttagaagt taaagtcaat attgatttta aatataagta atgaaggcat 8880 atttccaata actagtgata tggcatcgtt gcattttaca gtatcttcaa aaatacagaa 8940 tttatagaat aatttctcct catttaatat ttttcaaaat caaagttatg gtttcctcat 9000 tttactaaaa tcgtattcta attcttcatt atagtaaatc tatgagcaac tccttacttc 9060 ggttcctctg atttcaaggc catattttaa aaaatcaaaa ggcactgtga actattttga 9120 agaaaacaca acattttaat acagattgaa aggacctctt ctgaagctag aaacaatcta 9180 tagttataca tcttcattaa tactgtgtta ccttttaaaa tagtaatttt ttacattttc 9240 ctgtgtaaac ctaattgtgg tagaaatttt taccaactct atactcaatc aagcaaaatt 9300 tctgtatatt ccctgtggaa tgtacctatg tgagtttcag aaattctcaa aatacgtgtt 9360 caaaaatttc tgcttttgca tctttgggac acctcagaaa acttattaac aactgtgaat 9420 atgagaaata cagaagaaaa taataagccc tctatacata aatgcccagc acaattcatt 
9480 gttaaaaaac aaccaaacct cacactactg tatttcatta tctgtactga aagcaaatgc 9540 tttgtgacta ttaaatgttg cacatcattc attcaaaaaa aaaaaaaaaa aaa 9593 95 173 DNA Homo sapiens 95 ctgggaccct ggtcctcgag ctgacccctt tgaggacatg cggtacgtct gggggggctt 60 cgcctacttg caggatgtgg tggagcaggc aatcatcagg gtgctacggg caccgagaag 120 aaaactggtg tctatatgca acagatgccc tatccctgtt acgttgatga cat 173 96 9495 DNA Homo sapiens misc_feature (6763)..(6763) n is a, c, g, or t 96 caaacatgtc agctgttact ggaagtggcc tggcctctat ttatcttcct gatcctgatc 60 tctgttcggc tgagctaccc accctatgaa caacatgaat gccattttcc aaataaagcc 120 atgccctctg caggaacact tccttgggtt caggggatta tctgtaatgc caacaacccc 180 tgtttccgtt acccgactcc tggggaggct cccggagttg ttggaaactt taacaaatcc 240 attgtggctc gcctgttctc agatgctcgg aggcttcttt tatacagcca gaaagacacc 300 agcatgaagg acatgcgcaa agttctgaga acattacagc agatcaagaa atccagctca 360 aacttgaagc ttcaagattt cctggtggac aatgaaacct tctctgggtt cctgtatcac 420 aacctctctc tcccaaagtc tactgtggac aagatgctga gggctgatgt cattctccac 480 aaggtatttt tgcaaggcta ccagttacat ttgacaagtc tgtgcaatgg atcaaaatca 540 gaagagatga ttcaacttgg tgaccaagaa gtttctgagc tttgtggcct accaagggag 600 aaactggctg cagcagagcg agtacttcgt tccaacatgg acatcctgaa gccaatcctg 660 agaacactaa actctacatc tcccttcccg agcaaggagc tggccgaagc cacaaaaaca 720 ttgctgcata gtcttgggac tctggcccag gagctgttca gcatgagaag ctggagtgac 780 atgcgacagg aggtgatgtt tctgaccaat gtgaacagct ccagctcctc cacccaaatc 840 taccaggctg tgtctcgtat tgtctgcggg catcccgagg gaggggggct gaagatcaag 900 tctctcaact ggtatgagga caacaactac aaagccctct ttggaggcaa tggcactgag 960 gaagatgctg aaaccttcta tgacaactct acaactcctt actgcaatga tttgatgaag 1020 aatttggagt ctagtcctct ttcccgcatt atctggaaag ctctgaagcc gctgctcgtt 1080 gggaagatcc tgtatacacc tgacactcca gccacaaggc aggtcatggc tgaggtgaac 1140 aagaccttcc aggaactggc tgtgttccat gatctggaag gcatgtggga ggaactcagc 1200 cccaagatct ggaccttcat ggagaacagc caagaaatgg accttgtccg gatgctgttg 1260 gacagcaggg acaatgacca cttttgggaa cagcagttgg atggcttaga ttggacagcc 1320 caagacatcg 
tggcgttttt ggccaagcac ccagaggatg tccagtccag taatggttct 1380 gtgtacacct ggagagaagc tttcaacgag actaaccagg caatccggac catatctcgc 1440 ttcatggagt gtgtcaacct gaacaagcta gaacccatag caacagaagt ctggctcatc 1500 aacaagtcca tggagctgct ggatgagagg aagttctggg ctggtattgt gttcactgga 1560 attactccag gcagcattga gctgccccat catgtcaagt acaagatccg aatggacatt 1620 gacaatgtgg agaggacaaa taaaatcaag gatgggtact gggaccctgg tcctcgagct 1680 gacccctttg aggacatgcg gtacgtctgg gggggcttcg cctacttgca ggatgtggtg 1740 gagcaggcaa tcatcagggt gctcgggcac cgagaagaaa actggtgtct atatgcaaca 1800 gatgccctat ccctgttacg ttgatgacat ctttctgcgg gtgatgagcc ggtcaatgcc 1860 cctcttcatg acgctggcct ggatttactc agtggctgtg atcatcaagg gcatcgtgta 1920 tgagaaggag gcacggctga aagagaccat gcggatcatg ggcctggaca acagcatcct 1980 ctggtttagc tggttcatta gtagcctcat tcctcttctt gtgagcgctg gcctgctagt 2040 ggtcatcctg aagttaggaa acctgctgcc ctacagtgat cccagcgtgg tgtttgtctt 2100 cctgtccgtg tttgctgtgg tgacaatcct gcagtgcttc ctgattagca cactcttctc 2160 cagagccaac ctggcagcag cctgtggggg catcatctac ttcacgctgt acctgcccta 2220 cgtcctgtgt gtggcatggc aggactacgt gggcttcaca ctcaagatct tcgctagcct 2280 gctgtctcct gtggcttttg ggtttggctg tgagtacttt gccctttttg aggagcaggg 2340 cattggagtg cagtgggaca acctgtttga gagtcctgtg gaggaagatg gcttcaatct 2400 caccacttcg gtctccatga tgctgtttga caccttcctc tatggggtga tgacctggta 2460 cattgaggct gtctttccag gccagtacgg aattcccagg ccctggtatt ttccttgcac 2520 caagtcctac tggtttggcg aggaaagtga tgagaagagc caccctggtt ccaaccagaa 2580 gagaatatca gaaatctgca tggaggagga acccacccac ttgaagctgg gcgtgtccat 2640 tcagaacctg gtaaaagtct accgagatgg gatgaaggtg gctgtcgatg gcctggcact 2700 gaatttttat gagggccaga tcacctcctt cctgggccac aatggagcgg ggaagacgac 2760 caccatgtca atcctgaccg ggttgttccc cccgacctcg ggcaccgcct acatcctggg 2820 aaaagacatt cgctctgaga tgagcaccat ccggcagaac ctgggggtct gtccccagca 2880 taacgtgctg tttgacatgc tgactgtcga agaacacatc tggttctatg cccgcttgaa 2940 agggctctct gagaagcacg tgaaggcgga gatggagcag atggccctgg atgttggttt 3000 gccatcaagc aagctgaaaa 
gcaaaacaag ccagctgtca ggtggaatgc agagaaagct 3060 atctgtggcc ttggcctttg tcgggggatc taaggttgtc attctggatg aacccacagc 3120 tggtgtggac ccttactccc gcaggggaat atgggagctg ctgctgaaat accgacaagg 3180 ccgcaccatt attctctcta cacaccacat ggatgaagcg gacgtcctgg gggacaggat 3240 tgccatcatc tcccatggga agctgtgctg tgtgggctcc tccctgtttc tgaagaacca 3300 gctgggaaca ggctactacc tgaccttggt caagaaagat gtggaatcct ccctcagttc 3360 ctgcagaaac agtagtagca ctgtgtcata cctgaaaaag gaggacagtg tttctcagag 3420 cagttctgat gctggcctgg gcagcgacca tgagagtgac acgctgacca tcgatgtctc 3480 tgctatctcc aacctcatca ggaagcatgt gtctgaagcc cggctggtgg aagacatagg 3540 gcatgagctg acctatgtgc tgccatatga agctgctaag gagggagcct ttgtggaact 3600 ctttcatgag attgatgacc ggctctcaga cctgggcatt tctagttatg gcatctcaga 3660 gacgaccctg gaagaaatat tcctcaaggt ggccgaagag agtggggtgg atgctgagac 3720 ctcagatggt accttgccag caagacgaaa caggcgggcc ttcggggaca agcagagctg 3780 tcttcgcccg ttcactgaag atgatgctgc tgatccaaat gattctgaca tagacccaga 3840 atccagagag acagacttgc tcagtgggat ggatggcaaa gggtcctacc aggtgaaagg 3900 ctggaaactt acacagcaac agtttgtggc ccttttgtgg aagagactgc taattgccag 3960 acggagtcgg aaaggatttt ttgctcagat tgtcttgcca gctgtgtttg tctgcattgc 4020 ccttgtgttc agcctgatcg tgccaccctt tggcaagtac cccagcctgg aacttcagcc 4080 ctggatgtac aacgaacagt acacatttgt cagcaatgat gctcctgagg acacgggaac 4140 cctggaactc ttaaacgccc tcaccaaaga ccctggcttc gggacccgct gtatggaagg 4200 aaacccaatc ccagacacgc cctgccaggc aggggaggaa gagtggacca ctgccccagt 4260 tccccagacc atcatggacc tcttccagaa tgggaactgg acaatgcaga acccttcacc 4320 tgcatgccag tgtagcagcg acaaaatcaa gaagatgctg cctgtgtgtc ccccaggggc 4380 aggggggctg cctcctccac aaagaaaaca aaacactgca gatatccttc aggacctgac 4440 aggaagaaac atttcggatt atctggtgaa gacgtatgtg cagatcatag ccaaaagctt 4500 aaagaacaag atctgggtga atgagtttag gtatggcggc ttttccctgg gtgtcagtaa 4560 tactcaagca cttcctccga gtcaagaagt taatgatgcc accaaacaaa tgaagaaaca 4620 cctaaagctg gccaaggaca gttctgcaga tcgatttctc aacagcttgg gaagatttat 4680 gacaggactg gacaccagaa ataatgtcaa 
ggtgtggttc aataacaagg gctggcatgc 4740 aatcagctct ttcctgaatg tcatcaacaa tgccattctc cgggccaacc tgcaaaaggg 4800 agagaaccct agccattatg gaattactgc tttcaatcat cccctgaatc tcaccaagca 4860 gcagctctca gaggtggctc cgatgaccac atcagtggat gtccttgtgt ccatctgtgt 4920 catctttgca atgtccttcg tcccagccag ctttgtcgta ttcctgatcc aggagcgggt 4980 cagcaaagca aaacacctgc agttcatcag tggagtgaag cctgtcatct actggctctc 5040 taattttgtc tgggatatgt gcaattacgt tgtccctgcc acactggtca ttatcatctt 5100 catctgcttc cagcagaagt cctatgtgtc ctccaccaat ctgcctgtgc tagcccttct 5160 acttttgctg tatgggtggt caatcacacc tctcatgtac ccagcctcct ttgtgttcaa 5220 gatccccagc acagcctatg tggtgctcac cagcgtgaac ctcttcattg gcattaatgg 5280 cagcgtggcc acctttgtgc tggagctgtt caccgacaat aagctgaata atatcaatga 5340 tatcctgaag tccgtgttct tgatcttccc acatttttgc ctgggacgag ggctcatcga 5400 catggtgaaa aaccaggcaa tggctgatgc cctggaaagg tttggggaga atcgctttgt 5460 gtcaccatta tcttgggact tggtgggacg aaacctcttc gccatggccg tggaaggggt 5520 ggtgttcttc ctcattactg ttctgatcca gtacagattc ttcatcaggc ccagacctgt 5580 aaatgcaaag ctatctcctc tgaatgatga agatgaagat gtgaggcggg aaagacagag 5640 aattcttgat ggtggaggcc agaatgacat cttagaaatc aaggagttga cgaagatata 5700 tagaaggaag cggaagcctg ctgttgacag gatttgcgtg ggcattcctc ctggtgagtg 5760 ctttgggctc ctgggagtta atggggctgg aaaatcatca actttcaaga tgttaacagg 5820 agataccact gttaccagag gagatgcttt ccttaacaga aatagtatct tatcaaacat 5880 ccatgaagta catcagaaca tgggctactg ccctcagttt gatgccatca cagagctgtt 5940 gactgggaga gaacacgtgg agttctttgc ccttttgaga ggagtcccag agaaagaagt 6000 tggcaaggtt ggtgagtggg cgattcggaa actgggcctc gtgaagtatg gagaaaaata 6060 tgctggtaac tatagtggag gcaacaaacg caagctctct acagccatgg ctttgatcgg 6120 cgggcctcct gtggtgtttc tggatgaacc caccacaggc atggatccca aagcccggcg 6180 gttcttgtgg aattgtgccc taagtgttgt caaggagggg agatcagtag tgcttacatc 6240 tcatagtatg gaagaatgtg aagctctttg cactaggatg gcaatcatgg tcaatggaag 6300 gttcaggtgc cttggcagtg tccagcatct aaaaaatagg tttggagatg gttatacaat 6360 agttgtacga atagcagggt ccaacccgga cctgaagcct 
gtccaggatt tctttggact 6420 tgcatttcct ggaagtgttc caaaagagaa acaccggaac atgctacaat accagcttcc 6480 atcttcatta tcttctctgg ccaggatatt cagcatcctc tcccagagca aaaagcgact 6540 ccacatagaa gactactctg tttctcagac aacacttgac caagtatttg tgaactttgc 6600 caaggaccaa agtgatgatg accacttaaa agacctctca ttacacaaaa accagacagt 6660 agtggacgtt gcagttctca catcttttct acaggatgag aaagtgaaag aaagctatgt 6720 atgaagaatc ctgttcatac ggggtggctg aaagtaaaga ggnactagac tttcctttgc 6780 accatgtgaa gtgttgtgga gaaaagagcc agaagttgat gtgggaagaa gtaaactgga 6840 tactgtactg atactattca atgcaatgca attcaatgca atgaaaacaa aattccatta 6900 caggggcagt gcctttgtag cctatgtctt gtatggctct caagtgaaag acttgaattt 6960 agttttttac ctatacctat gtgaaactct attatggaac ccaatggaca tatgggtttg 7020 aactcacact tttttttttt ttttgttcct gtgtattctc attggggttg caacaataat 7080 tcatcaagta atcatggcca gcgattattg atcaaaatca aaaggtaatg cacatcctca 7140 ttcactaagc catgccatgc ccaggagact ggtttcccgg tgacacatcc attgctggca 7200 atgagtgtgc cagagttatt agtgccaagt ttttcagaaa gtttgaagca ccatggtgtg 7260 tcatgctcac ttttgtgaaa gctgctctgc tcagagtcta tcaacattga atatcagttg 7320 acagaatggt gccatgcgtg gctaacatcc tgctttgatt ccctctgata agctgttctg 7380 gtggcagtaa catgcaacaa aaatgtgggt gtctctaggc acgggaaact tggttccatt 7440 gttatattgt cctatgcttc gagccatggg tctacagggt catccttatg agactcttaa 7500 atatacttag atcctggtaa gaggcaaaga atcaacagcc aaactgctgg ggctgcaagc 7560 tgctgaagcc agggcatggg attaaagaga ttgtgcgttc aaacctaggg aagcctgtgc 7620 ccatttgtcc tgactgtctg ctaacatggt acactgcatc tcaagatgtt tatctgacac 7680 aagtgtatta tttctggctt tttgaattaa tctagaaaat gaaaagatgg agttgtattt 7740 tgacaaaaat gtttgtactt tttaatgtta tttggaattt taagttctat cagtgacttc 7800 tgaatcctta gaatggcctc tttgtagaac cctgtggtat agaggagtat ggccactgcc 7860 ccactatttt tattttctta tgtaagtttg catatcagtc atgactagtg cctagaaagc 7920 aatgtgatgg tcaggatctc atgacattat atttgagttt ctttcagatc atttaggata 7980 ctcttaatct cacttcatca atcaaatatt ttttgagtgt atgctgtagc tgaaagagta 8040 tgtacgtacg tataagacta gagagatatt aagtctcagt acacttcctg 
tgccatgtta 8100 ttcagctcac tggtttacaa atataggttg tcttgtggtt gtaggagccc actgtaacaa 8160 tactgggcag cctttttttt tttttttaat tgcaacaatg caaaagccaa gaaagtataa 8220 gggtcacaag tctaaacaat gaattcttca acagggaaaa cagctagctt gaaaacttgc 8280 tgaaaaacac aacttgtgtt tatggcattt agtaccttca aataattggc tttgcagata 8340 ttggataccc cattaaatct gacagtctca aatttttcat ctcttcaatc actagtcaag 8400 aaaaatataa aaacaacaaa tacttccata tggagcattt ttcagagttt tctaacccag 8460 tcttattttt ctagtcagta aacatttgta aaaatactgt ttcactaata cttactgtta 8520 actgtcttga gagaaaagaa aaatatgaga gaactattgt ttggggaagt tcaagtgatc 8580 tttcaatatc attactaact tcttccactt tttccaaaat ttgaatatta acgctaaagg 8640 tgtaagactt cagatttcaa attaatcttt ctatattttt taaatttaca gaatattata 8700 taacccactg ctgaaaaaga aaaaaatgat tgttttagaa gttaaagtca atattgattt 8760 taaatataag taatgaaggc atatttccaa taactagtga tatggcatcg ttgcatttta 8820 cagtatcttc aaaaatacag aatttataga ataatttctc ctcatttaat atttttcaaa 8880 atcaaagtta tggtttcctc attttactaa aatcgtattc taattcttca ttatagtaaa 8940 tctatgagca actccttact tcggttcctc tgatttcaag gccatatttt aaaaaatcaa 9000 aaggcactgt gaactatttt gaagaaaaca caacatttta atacagattg aaaggacctc 9060 ttctgaagct agaaacaatc tatagttata catcttcatt aatactgtgt taccttttaa 9120 aatagtaatt ttttacattt tcctgtgtaa acctaattgt ggtagaaatt tttaccaact 9180 ctatactcaa tcaagcaaaa tttctgtata ttccctgtgg aatgtaccta tgtgagtttc 9240 agaaattctc aaaatacgtg ttcaaaaatt tctgcttttg catctttggg acacctcaga 9300 aaacttatta acaactgtga atatgagaaa tacagaagaa aataataagc cctctataca 9360 taaatgccca gcacaattca ttgttaaaaa acaaccaaac ctcacactac tgtatttcat 9420 tatctgtact gaaagcaaat gctttgtgac tattaaatgt tgcacatcat tcattcaaaa 9480 aaaaaaaaaa aaaaa 9495 97 41 DNA Homo sapiens 97 tgaaggctgt tcttctatca gtgtgtcaac ctgaacaagc t 41 98 41 DNA Homo sapiens 98 tgaaggctgt tcttctatca atgtgtcaac ctgaacaagc t 41 99 41 DNA Homo sapiens 99 agttacctgc aagccactgt ttttaaccag tttatactgt g 41 100 41 DNA Homo sapiens 100 agttacctgc aagccactgt atttaaccag tttatactgt g 41 101 41 DNA Homo sapiens 
101 caggctcaga ggccttggcc catcaccctg gctcacgtgt g 41 102 41 DNA Homo sapiens 102 caggctcaga ggccttggcc tatcaccctg gctcacgtgt g 41 103 41 DNA Homo sapiens 103 atgggcctgg acaacagcat cctctggttt agctggttca t 41 104 41 DNA Homo sapiens 104 atgggcctgg acaacagcat actctggttt agctggttca t 41 105 41 DNA Homo sapiens 105 aagggaggag aagaagaaaa aaaatccaag cctctggtag a 41 106 41 DNA Homo sapiens 106 aagggaggag aagaagaaaa gaaatccaag cctctggtag a 41 107 41 DNA Homo sapiens 107 tcttcccttt gcagagacac gccctgccag gcaggggagg a 41 108 41 DNA Homo sapiens 108 tcttcccttt gcagagacac accctgccag gcaggggagg a 41 109 23 DNA Homo sapiens 109 ccttgcctcc tagtgtagga ttt 23 110 24 DNA Homo sapiens 110 aatataatag gtgctctgga cctc 24 111 25 DNA Homo sapiens 111 atacaaaaat agaaaaaggg gcttg 25 112 25 DNA Homo sapiens 112 atggatgaga aggaaagagg tttac 25 113 25 DNA Homo sapiens 113 cattggatca tacgtacatt tcaga 25 114 25 DNA Homo sapiens 114 tcactttccc caactataaa tggat 25 115 25 DNA Homo sapiens 115 gtagatcata caagtgagtg cttgg 25 116 25 DNA Homo sapiens 116 ctgttctcaa cttgctgctt ttatt 25 117 23 DNA Homo sapiens 117 gcaaattcaa atttctccag gta 23 118 23 DNA Homo sapiens 118 gcacaaagaa aggacatcag cta 23 119 23 DNA Homo sapiens 119 cagtgcttac ccctgctaat atc 23 120 23 DNA Homo sapiens 120 gagatggaga aatcattcac agc 23 121 23 DNA Homo sapiens 121 acatgtggaa tgacctaaac acc 23 122 23 DNA Homo sapiens 122 cttaggacat ttggccttgc tat 23 123 24 DNA Homo sapiens 123 catttctgtt ttaagagcct gtca 24 124 23 DNA Homo sapiens 124 aatgtggcat gcagttgata aat 23 125 23 DNA Homo sapiens 125 gtttgtggtt gttacggaat gat 23 126 23 DNA Homo sapiens 126 cctcccaaca tgatatctca ctc 23 127 23 DNA Homo sapiens 127 gtctgggacc tgtagtcagg ttt 23 128 23 DNA Homo sapiens 128 ccaatagaca gaatcaggcc ata 23 129 23 DNA Homo sapiens 129 tgccaacatt tattagagga agc 23 130 23 DNA Homo sapiens 130 atccgtttaa cctgccaact act 23 131 23 DNA Homo sapiens 131 tctcaggagc cgtttattta atg 23 132 23 DNA Homo sapiens 132 gccaacttta ccatgagttg aaa 23 133 23 DNA Homo sapiens 
133 tctgatcata gtgttttgcc ttg 23 134 23 DNA Homo sapiens 134 tgttccccta caatgagatt cac 23 135 22 DNA Homo sapiens 135 gggtgaacag atgtttttcc tt 22 136 23 DNA Homo sapiens 136 tagctggaac atttcctgat gat 23 137 23 DNA Homo sapiens 137 ccctttcttg tctgataatg gtg 23 138 23 DNA Homo sapiens 138 cacaattaaa cactgtcctc tgg 23 139 2201 PRT Homo sapiens 139 Met Pro Ser Ala Gly Thr Leu Pro Trp Val Gln Gly Ile Ile Cys Asn 1 5 10 15 Ala Asn Asn Pro Cys Phe Arg Tyr Pro Thr Pro Gly Glu Ala Pro Gly 20 25 30 Val Val Gly Asn Phe Asn Lys Ser Ile Val Ala Arg Leu Phe Ser Asp 35 40 45 Ala Arg Arg Leu Leu Leu Tyr Ser Gln Lys Asp Thr Ser Met Lys Asp 50 55 60 Met Arg Lys Val Leu Arg Thr Leu Gln Gln Ile Lys Lys Ser Ser Ser 65 70 75 80 Asn Leu Lys Leu Gln Asp Phe Leu Val Asp Asn Glu Thr Phe Ser Gly 85 90 95 Phe Leu Tyr His Asn Leu Ser Leu Pro Lys Ser Thr Val Asp Lys Met 100 105 110 Leu Arg Ala Asp Val Ile Leu His Lys Val Phe Leu Gln Gly Tyr Gln 115 120 125 Leu His Leu Thr Ser Leu Cys Asn Gly Ser Lys Ser Glu Glu Met Ile 130 135 140 Gln Leu Gly Asp Gln Glu Val Ser Glu Leu Cys Gly Leu Pro Arg Glu 145 150 155 160 Lys Leu Ala Ala Ala Glu Arg Val Leu Arg Ser Asn Met Asp Ile Leu 165 170 175 Lys Pro Ile Leu Arg Thr Leu Asn Ser Thr Ser Pro Phe Pro Ser Lys 180 185 190 Glu Leu Ala Glu Ala Thr Lys Thr Leu Leu His Ser Leu Gly Thr Leu 195 200 205 Ala Gln Glu Leu Phe Ser Met Arg Ser Trp Ser Asp Met Arg Gln Glu 210 215 220 Val Met Phe Leu Thr Asn Val Asn Ser Ser Ser Ser Ser Thr Gln Ile 225 230 235 240 Tyr Gln Ala Val Ser Arg Ile Val Cys Gly His Pro Glu Gly Gly Gly 245 250 255 Leu Lys Ile Lys Ser Leu Asn Trp Tyr Glu Asp Asn Asn Tyr Lys Ala 260 265 270 Leu Phe Gly Gly Asn Gly Thr Glu Glu Asp Ala Glu Thr Phe Tyr Asp 275 280 285 Asn Ser Thr Thr Pro Tyr Cys Asn Asp Leu Met Lys Asn Leu Glu Ser 290 295 300 Ser Pro Leu Ser Arg Ile Ile Trp Lys Ala Leu Lys Pro Leu Leu Val 305 310 315 320 Gly Lys Ile Leu Tyr Thr Pro Asp Thr Pro Ala Thr Arg Gln Val Met 325 330 335 Ala Glu Val Asn Lys Thr Phe Gln Glu Leu Ala Val Phe His 
Asp Leu 340 345 350 Glu Gly Met Trp Glu Glu Leu Ser Pro Lys Ile Trp Thr Phe Met Glu 355 360 365 Asn Ser Gln Glu Met Asp Leu Val Arg Met Leu Leu Asp Ser Arg Asp 370 375 380 Asn Asp His Phe Trp Glu Gln Gln Leu Asp Gly Leu Asp Trp Thr Ala 385 390 395 400 Gln Asp Ile Val Ala Phe Leu Ala Lys His Pro Glu Asp Val Gln Ser 405 410 415 Ser Asn Gly Ser Val Tyr Thr Trp Arg Glu Ala Phe Asn Glu Thr Asn 420 425 430 Gln Ala Ile Arg Thr Ile Ser Arg Phe Met Glu Cys Val Asn Leu Asn 435 440 445 Lys Leu Glu Pro Ile Ala Thr Glu Val Trp Leu Ile Asn Lys Ser Met 450 455 460 Glu Leu Leu Asp Glu Arg Lys Phe Trp Ala Gly Ile Val Phe Thr Gly 465 470 475 480 Ile Thr Pro Gly Ser Ile Glu Leu Pro His His Val Lys Tyr Lys Ile 485 490 495 Arg Met Asp Ile Asp Asn Val Glu Arg Thr Asn Lys Ile Lys Asp Gly 500 505 510 Tyr Trp Asp Pro Gly Pro Arg Ala Asp Pro Phe Glu Asp Met Arg Tyr 515 520 525 Val Trp Gly Gly Phe Ala Tyr Leu Gln Asp Val Val Glu Gln Ala Ile 530 535 540 Ile Arg Val Leu Thr Gly Thr Glu Lys Lys Thr Gly Val Tyr Met Gln 545 550 555 560 Gln Met Pro Tyr Pro Cys Tyr Val Asp Asp Ile Phe Leu Arg Val Met 565 570 575 Ser Arg Ser Met Pro Leu Phe Met Thr Leu Ala Trp Ile Tyr Ser Val 580 585 590 Ala Val Ile Ile Lys Gly Ile Val Tyr Glu Lys Glu Ala Arg Leu Lys 595 600 605 Glu Thr Met Arg Ile Met Gly Leu Asp Asn Ser Ile Leu Trp Phe Ser 610 615 620 Trp Phe Ile Ser Ser Leu Ile Pro Leu Leu Val Ser Ala Gly Leu Leu 625 630 635 640 Val Val Ile Leu Lys Leu Gly Asn Leu Leu Pro Tyr Ser Asp Pro Ser 645 650 655 Val Val Phe Val Phe Leu Ser Val Phe Ala Val Val Thr Ile Leu Gln 660 665 670 Cys Phe Leu Ile Ser Thr Leu Phe Ser Arg Ala Asn Leu Ala Ala Ala 675 680 685 Cys Gly Gly Ile Ile Tyr Phe Thr Leu Tyr Leu Pro Tyr Val Leu Cys 690 695 700 Val Ala Trp Gln Asp Tyr Val Gly Phe Thr Leu Lys Ile Phe Ala Ser 705 710 715 720 Leu Leu Ser Pro Val Ala Phe Gly Phe Gly Cys Glu Tyr Phe Ala Leu 725 730 735 Phe Glu Glu Gln Gly Ile Gly Val Gln Trp Asp Asn Leu Phe Glu Ser 740 745 750 Pro Val Glu Glu Asp Gly Phe Asn Leu Thr Thr Ser Val Ser Met 
Met 755 760 765 Leu Phe Asp Thr Phe Leu Tyr Gly Val Met Thr Trp Tyr Ile Glu Ala 770 775 780 Val Phe Pro Gly Gln Tyr Gly Ile Pro Arg Pro Trp Tyr Phe Pro Cys 785 790 795 800 Thr Lys Ser Tyr Trp Phe Gly Glu Glu Ser Asp Glu Lys Ser His Pro 805 810 815 Gly Ser Asn Gln Lys Arg Ile Ser Glu Ile Cys Met Glu Glu Glu Pro 820 825 830 Thr His Leu Lys Leu Gly Val Ser Ile Gln Asn Leu Val Lys Val Tyr 835 840 845 Arg Asp Gly Met Lys Val Ala Val Asp Gly Leu Ala Leu Asn Phe Tyr 850 855 860 Glu Gly Gln Ile Thr Ser Phe Leu Gly His Asn Gly Ala Gly Lys Thr 865 870 875 880 Thr Thr Met Ser Ile Leu Thr Gly Leu Phe Pro Pro Thr Ser Gly Thr 885 890 895 Ala Tyr Ile Leu Gly Lys Asp Ile Arg Ser Glu Met Ser Thr Ile Arg 900 905 910 Gln Asn Leu Gly Val Cys Pro Gln His Asn Val Leu Phe Asp Met Leu 915 920 925 Thr Val Glu Glu His Ile Trp Phe Tyr Ala Arg Leu Lys Gly Leu Ser 930 935 940 Glu Lys His Val Lys Ala Glu Met Glu Gln Met Ala Leu Asp Val Gly 945 950 955 960 Leu Pro Ser Ser Lys Leu Lys Ser Lys Thr Ser Gln Leu Ser Gly Gly 965 970 975 Met Gln Arg Lys Leu Ser Val Ala Leu Ala Phe Val Gly Gly Ser Lys 980 985 990 Val Val Ile Leu Asp Glu Pro Thr Ala Gly Val Asp Pro Tyr Ser Arg 995 1000 1005 Arg Gly Ile Trp Glu Leu Leu Leu Lys Tyr Arg Gln Gly Arg Thr 1010 1015 1020 Ile Ile Leu Ser Thr His His Met Asp Glu Ala Asp Val Leu Gly 1025 1030 1035 Asp Arg Ile Ala Ile Ile Ser His Gly Lys Leu Cys Cys Val Gly 1040 1045 1050 Ser Ser Leu Phe Leu Lys Asn Gln Leu Gly Thr Gly Tyr Tyr Leu 1055 1060 1065 Thr Leu Val Lys Lys Asp Val Glu Ser Ser Leu Ser Ser Cys Arg 1070 1075 1080 Asn Ser Ser Ser Thr Val Ser Tyr Leu Lys Lys Glu Asp Ser Val 1085 1090 1095 Ser Gln Ser Ser Ser Asp Ala Gly Leu Gly Ser Asp His Glu Ser 1100 1105 1110 Asp Thr Leu Thr Ile Asp Val Ser Ala Ile Ser Asn Leu Ile Arg 1115 1120 1125 Lys His Val Ser Glu Ala Arg Leu Val Glu Asp Ile Gly His Glu 1130 1135 1140 Leu Thr Tyr Val Leu Pro Tyr Glu Ala Ala Lys Glu Gly Ala Phe 1145 1150 1155 Val Glu Leu Phe His Glu Ile Asp Asp Arg Leu Ser Asp Leu Gly 1160 1165 1170 Ile 
Ser Ser Tyr Gly Ile Ser Glu Thr Thr Leu Glu Glu Ile Phe 1175 1180 1185 Leu Lys Val Ala Glu Glu Ser Gly Val Asp Ala Glu Thr Ser Asp 1190 1195 1200 Gly Thr Leu Pro Ala Arg Arg Asn Arg Arg Ala Phe Gly Asp Lys 1205 1210 1215 Gln Ser Cys Leu Arg Pro Phe Thr Glu Asp Asp Ala Ala Asp Pro 1220 1225 1230 Asn Asp Ser Asp Ile Asp Pro Glu Ser Arg Glu Thr Asp Leu Leu 1235 1240 1245 Ser Gly Met Asp Gly Lys Gly Ser Tyr Gln Val Lys Gly Trp Lys 1250 1255 1260 Leu Thr Gln Gln Gln Phe Val Ala Leu Leu Trp Lys Arg Leu Leu 1265 1270 1275 Ile Ala Arg Arg Ser Arg Lys Gly Phe Phe Ala Gln Ile Val Leu 1280 1285 1290 Pro Ala Val Phe Val Cys Ile Ala Leu Val Phe Ser Leu Ile Val 1295 1300 1305 Pro Pro Phe Gly Lys Tyr Pro Ser Leu Glu Leu Gln Pro Trp Met 1310 1315 1320 Tyr Asn Glu Gln Tyr Thr Phe Val Ser Asn Asp Ala Pro Glu Asp 1325 1330 1335 Thr Gly Thr Leu Glu Leu Leu Asn Ala Leu Thr Lys Asp Pro Gly 1340 1345 1350 Phe Gly Thr Arg Cys Met Glu Gly Asn Pro Ile Pro Asp Thr Pro 1355 1360 1365 Cys Gln Ala Gly Glu Glu Glu Trp Thr Thr Ala Pro Val Pro Gln 1370 1375 1380 Thr Ile Met Asp Leu Phe Gln Asn Gly Asn Trp Thr Met Gln Asn 1385 1390 1395 Pro Ser Pro Ala Cys Gln Cys Ser Ser Asp Lys Ile Lys Lys Met 1400 1405 1410 Leu Pro Val Cys Pro Pro Gly Ala Gly Gly Leu Pro Pro Pro Gln 1415 1420 1425 Arg Lys Gln Asn Thr Ala Asp Ile Leu Gln Asp Leu Thr Gly Arg 1430 1435 1440 Asn Ile Ser Asp Tyr Leu Val Lys Thr Tyr Val Gln Ile Ile Ala 1445 1450 1455 Lys Ser Leu Lys Asn Lys Ile Trp Val Asn Glu Phe Arg Tyr Gly 1460 1465 1470 Gly Phe Ser Leu Gly Val Ser Asn Thr Gln Ala Leu Pro Pro Ser 1475 1480 1485 Gln Glu Val Asn Asp Ala Thr Lys Gln Met Lys Lys His Leu Lys 1490 1495 1500 Leu Ala Lys Asp Ser Ser Ala Asp Arg Phe Leu Asn Ser Leu Gly 1505 1510 1515 Arg Phe Met Thr Gly Leu Asp Thr Arg Asn Asn Val Lys Val Trp 1520 1525 1530 Phe Asn Asn Lys Gly Trp His Ala Ile Ser Ser Phe Leu Asn Val 1535 1540 1545 Ile Asn Asn Ala Ile Leu Arg Ala Asn Leu Gln Lys Gly Glu Asn 1550 1555 1560 Pro Ser His Tyr Gly Ile Thr Ala Phe Asn His Pro Leu 
Asn Leu 1565 1570 1575 Thr Lys Gln Gln Leu Ser Glu Val Ala Pro Met Thr Thr Ser Val 1580 1585 1590 Asp Val Leu Val Ser Ile Cys Val Ile Phe Ala Met Ser Phe Val 1595 1600 1605 Pro Ala Ser Phe Val Val Phe Leu Ile Gln Glu Arg Val Ser Lys 1610 1615 1620 Ala Lys His Leu Gln Phe Ile Ser Gly Val Lys Pro Val Ile Tyr 1625 1630 1635 Trp Leu Ser Asn Phe Val Trp Asp Met Cys Asn Tyr Val Val Pro 1640 1645 1650 Ala Thr Leu Val Ile Ile Ile Phe Ile Cys Phe Gln Gln Lys Ser 1655 1660 1665 Tyr Val Ser Ser Thr Asn Leu Pro Val Leu Ala Leu Leu Leu Leu 1670 1675 1680 Leu Tyr Gly Trp Ser Ile Thr Pro Leu Met Tyr Pro Ala Ser Phe 1685 1690 1695 Val Phe Lys Ile Pro Ser Thr Ala Tyr Val Val Leu Thr Ser Val 1700 1705 1710 Asn Leu Phe Ile Gly Ile Asn Gly Ser Val Ala Thr Phe Val Leu 1715 1720 1725 Glu Leu Phe Thr Asp Asn Lys Leu Asn Asn Ile Asn Asp Ile Leu 1730 1735 1740 Lys Ser Val Phe Leu Ile Phe Pro His Phe Cys Leu Gly Arg Gly 1745 1750 1755 Leu Ile Asp Met Val Lys Asn Gln Ala Met Ala Asp Ala Leu Glu 1760 1765 1770 Arg Phe Gly Glu Asn Arg Phe Val Ser Pro Leu Ser Trp Asp Leu 1775 1780 1785 Val Gly Arg Asn Leu Phe Ala Met Ala Val Glu Gly Val Val Phe 1790 1795 1800 Phe Leu Ile Thr Val Leu Ile Gln Tyr Arg Phe Phe Ile Arg Pro 1805 1810 1815 Arg Pro Val Asn Ala Lys Leu Ser Pro Leu Asn Asp Glu Asp Glu 1820 1825 1830 Asp Val Arg Arg Glu Arg Gln Arg Ile Leu Asp Gly Gly Gly Gln 1835 1840 1845 Asn Asp Ile Leu Glu Ile Lys Glu Leu Thr Lys Ile Tyr Arg Arg 1850 1855 1860 Lys Arg Lys Pro Ala Val Asp Arg Ile Cys Val Gly Ile Pro Pro 1865 1870 1875 Gly Glu Cys Phe Gly Leu Leu Gly Val Asn Gly Ala Gly Lys Ser 1880 1885 1890 Ser Thr Phe Lys Met Leu Thr Gly Asp Thr Thr Val Thr Arg Gly 1895 1900 1905 Asp Ala Phe Leu Asn Arg Asn Ser Ile Leu Ser Asn Ile His Glu 1910 1915 1920 Val His Gln Asn Met Gly Tyr Cys Pro Gln Phe Asp Ala Ile Thr 1925 1930 1935 Glu Leu Leu Thr Gly Arg Glu His Val Glu Phe Phe Ala Leu Leu 1940 1945 1950 Arg Gly Val Pro Glu Lys Glu Val Gly Lys Val Gly Glu Trp Ala 1955 1960 1965 Ile Arg Lys Leu Gly Leu 
Val Lys Tyr Gly Glu Lys Tyr Ala Gly 1970 1975 1980 Asn Tyr Ser Gly Gly Asn Lys Arg Lys Leu Ser Thr Ala Met Ala 1985 1990 1995 Leu Ile Gly Gly Pro Pro Val Val Phe Leu Asp Glu Pro Thr Thr 2000 2005 2010 Gly Met Asp Pro Lys Ala Arg Arg Phe Leu Trp Asn Cys Ala Leu 2015 2020 2025 Ser Val Val Lys Glu Gly Arg Ser Val Val Leu Thr Ser His Ser 2030 2035 2040 Met Glu Glu Cys Glu Ala Leu Cys Thr Arg Met Ala Ile Met Val 2045 2050 2055 Asn Gly Arg Phe Arg Cys Leu Gly Ser Val Gln His Leu Lys Asn 2060 2065 2070 Arg Phe Gly Asp Gly Tyr Thr Ile Val Val Arg Ile Ala Gly Ser 2075 2080 2085 Asn Pro Asp Leu Lys Pro Val Gln Asp Phe Phe Gly Leu Ala Phe 2090 2095 2100 Pro Gly Ser Val Pro Lys Glu Lys His Arg Asn Met Leu Gln Tyr 2105 2110 2115 Gln Leu Pro Ser Ser Leu Ser Ser Leu Ala Arg Ile Phe Ser Ile 2120 2125 2130 Leu Ser Gln Ser Lys Lys Arg Leu His Ile Glu Asp Tyr Ser Val 2135 2140 2145 Ser Gln Thr Thr Leu Asp Gln Val Phe Val Asn Phe Ala Lys Asp 2150 2155 2160 Gln Ser Asp Asp Asp His Leu Lys Asp Leu Ser Leu His Lys Asn 2165 2170 2175 Gln Thr Val Val Asp Val Ala Val Leu Thr Ser Phe Leu Gln Asp 2180 2185 2190 Glu Lys Val Lys Glu Ser Tyr Val 2195 2200 140 2233 PRT Homo sapiens 140 Met Pro Ser Ala Gly Thr Leu Pro Trp Val Gln Gly Ile Ile Cys Asn 1 5 10 15 Ala Asn Asn Pro Cys Phe Arg Tyr Pro Thr Pro Gly Glu Ala Pro Gly 20 25 30 Val Val Gly Asn Phe Asn Lys Ser Ile Val Ala Arg Leu Phe Ser Asp 35 40 45 Ala Arg Arg Leu Leu Leu Tyr Ser Gln Lys Asp Thr Ser Met Lys Asp 50 55 60 Met Arg Lys Val Leu Arg Thr Leu Gln Gln Ile Lys Lys Ser Ser Ser 65 70 75 80 Asn Leu Lys Leu Gln Asp Phe Leu Val Asp Asn Glu Thr Phe Ser Gly 85 90 95 Phe Leu Tyr His Asn Leu Ser Leu Pro Lys Ser Thr Val Asp Lys Met 100 105 110 Leu Arg Ala Asp Val Ile Leu His Lys Val Phe Leu Gln Gly Tyr Gln 115 120 125 Leu His Leu Thr Ser Leu Cys Asn Gly Ser Lys Ser Glu Glu Met Ile 130 135 140 Gln Leu Gly Asp Gln Glu Val Ser Glu Leu Cys Gly Leu Pro Arg Glu 145 150 155 160 Lys Leu Ala Ala Ala Glu Arg Val Leu Arg Ser Asn Met Asp Ile Leu 165 170 175 
Lys Pro Ile Leu Arg Thr Leu Asn Ser Thr Ser Pro Phe Pro Ser Lys 180 185 190 Glu Leu Ala Glu Ala Thr Lys Thr Leu Leu His Ser Leu Gly Thr Leu 195 200 205 Ala Gln Glu Leu Phe Ser Met Arg Ser Trp Ser Asp Met Arg Gln Glu 210 215 220 Val Met Phe Leu Thr Asn Val Asn Ser Ser Ser Ser Ser Thr Gln Ile 225 230 235 240 Tyr Gln Ala Val Ser Arg Ile Val Cys Gly His Pro Glu Gly Gly Gly 245 250 255 Leu Lys Ile Lys Ser Leu Asn Trp Tyr Glu Asp Asn Asn Tyr Lys Ala 260 265 270 Leu Phe Gly Gly Asn Gly Thr Glu Glu Asp Ala Glu Thr Phe Tyr Asp 275 280 285 Asn Ser Thr Thr Pro Tyr Cys Asn Asp Leu Met Lys Asn Leu Glu Ser 290 295 300 Ser Pro Leu Ser Arg Ile Ile Trp Lys Ala Leu Lys Pro Leu Leu Val 305 310 315 320 Gly Lys Ile Leu Tyr Thr Pro Asp Thr Pro Ala Thr Arg Gln Val Met 325 330 335 Ala Glu Val Asn Lys Thr Phe Gln Glu Leu Ala Val Phe His Asp Leu 340 345 350 Glu Gly Met Trp Glu Glu Leu Ser Pro Lys Ile Trp Thr Phe Met Glu 355 360 365 Asn Ser Gln Glu Met Asp Leu Val Arg Met Leu Leu Asp Ser Arg Asp 370 375 380 Asn Asp His Phe Trp Glu Gln Gln Leu Asp Gly Leu Asp Trp Thr Ala 385 390 395 400 Gln Asp Ile Val Ala Phe Leu Ala Lys His Pro Glu Asp Val Gln Ser 405 410 415 Ser Asn Gly Ser Val Tyr Thr Trp Arg Glu Ala Phe Asn Glu Thr Asn 420 425 430 Gln Ala Ile Arg Thr Ile Ser Arg Phe Met Glu Cys Val Asn Leu Asn 435 440 445 Lys Leu Glu Pro Ile Ala Thr Glu Val Trp Leu Ile Asn Lys Ser Met 450 455 460 Glu Leu Leu Glu Tyr Ser Gly Val Thr Ser Ala His Cys Asn Leu Cys 465 470 475 480 Leu Leu Ser Ser Ser Asp Ser Arg Ala Ser Ala Ser Gln Val Ala Gly 485 490 495 Ile Thr Ala Pro Ala Thr Thr Pro Gly Ala Gly Ile Val Phe Thr Gly 500 505 510 Ile Thr Pro Gly Ser Ile Glu Leu Pro His His Val Lys Tyr Lys Ile 515 520 525 Arg Met Asp Ile Asp Asn Val Glu Arg Thr Asn Lys Ile Lys Asp Gly 530 535 540 Tyr Trp Asp Pro Gly Pro Arg Ala Asp Pro Phe Glu Asp Met Arg Tyr 545 550 555 560 Val Trp Gly Gly Phe Ala Tyr Leu Gln Asp Val Val Glu Gln Ala Ile 565 570 575 Ile Arg Val Leu Thr Gly Thr Glu Lys Lys Thr Gly Val Tyr Met Gln 580 585 590 Gln 
Met Pro Tyr Pro Cys Tyr Val Asp Asp Ile Phe Leu Arg Val Met 595 600 605 Ser Arg Ser Met Pro Leu Phe Met Thr Leu Ala Trp Ile Tyr Ser Val 610 615 620 Ala Val Ile Ile Lys Gly Ile Val Tyr Glu Lys Glu Ala Arg Leu Lys 625 630 635 640 Glu Thr Met Arg Ile Met Gly Leu Asp Asn Ser Ile Leu Trp Phe Ser 645 650 655 Trp Phe Ile Ser Ser Leu Ile Pro Leu Leu Val Ser Ala Gly Leu Leu 660 665 670 Val Val Ile Leu Lys Leu Gly Asn Leu Leu Pro Tyr Ser Asp Pro Ser 675 680 685 Val Val Phe Val Phe Leu Ser Val Phe Ala Val Val Thr Ile Leu Gln 690 695 700 Cys Phe Leu Ile Ser Thr Leu Phe Ser Arg Ala Asn Leu Ala Ala Ala 705 710 715 720 Cys Gly Gly Ile Ile Tyr Phe Thr Leu Tyr Leu Pro Tyr Val Leu Cys 725 730 735 Val Ala Trp Gln Asp Tyr Val Gly Phe Thr Leu Lys Ile Phe Ala Ser 740 745 750 Leu Leu Ser Pro Val Ala Phe Gly Phe Gly Cys Glu Tyr Phe Ala Leu 755 760 765 Phe Glu Glu Gln Gly Ile Gly Val Gln Trp Asp Asn Leu Phe Glu Ser 770 775 780 Pro Val Glu Glu Asp Gly Phe Asn Leu Thr Thr Ser Val Ser Met Met 785 790 795 800 Leu Phe Asp Thr Phe Leu Tyr Gly Val Met Thr Trp Tyr Ile Glu Ala 805 810 815 Val Phe Pro Gly Gln Tyr Gly Ile Pro Arg Pro Trp Tyr Phe Pro Cys 820 825 830 Thr Lys Ser Tyr Trp Phe Gly Glu Glu Ser Asp Glu Lys Ser His Pro 835 840 845 Gly Ser Asn Gln Lys Arg Ile Ser Glu Ile Cys Met Glu Glu Glu Pro 850 855 860 Thr His Leu Lys Leu Gly Val Ser Ile Gln Asn Leu Val Lys Val Tyr 865 870 875 880 Arg Asp Gly Met Lys Val Ala Val Asp Gly Leu Ala Leu Asn Phe Tyr 885 890 895 Glu Gly Gln Ile Thr Ser Phe Leu Gly His Asn Gly Ala Gly Lys Thr 900 905 910 Thr Thr Met Ser Ile Leu Thr Gly Leu Phe Pro Pro Thr Ser Gly Thr 915 920 925 Ala Tyr Ile Leu Gly Lys Asp Ile Arg Ser Glu Met Ser Thr Ile Arg 930 935 940 Gln Asn Leu Gly Val Cys Pro Gln His Asn Val Leu Phe Asp Met Leu 945 950 955 960 Thr Val Glu Glu His Ile Trp Phe Tyr Ala Arg Leu Lys Gly Leu Ser 965 970 975 Glu Lys His Val Lys Ala Glu Met Glu Gln Met Ala Leu Asp Val Gly 980 985 990 Leu Pro Ser Ser Lys Leu Lys Ser Lys Thr Ser Gln Leu Ser Gly Gly 995 1000 1005 Met 
Gln Arg Lys Leu Ser Val Ala Leu Ala Phe Val Gly Gly Ser 1010 1015 1020 Lys Val Val Ile Leu Asp Glu Pro Thr Ala Gly Val Asp Pro Tyr 1025 1030 1035 Ser Arg Arg Gly Ile Trp Glu Leu Leu Leu Lys Tyr Arg Gln Gly 1040 1045 1050 Arg Thr Ile Ile Leu Ser Thr His His Met Asp Glu Ala Asp Val 1055 1060 1065 Leu Gly Asp Arg Ile Ala Ile Ile Ser His Gly Lys Leu Cys Cys 1070 1075 1080 Val Gly Ser Ser Leu Phe Leu Lys Asn Gln Leu Gly Thr Gly Tyr 1085 1090 1095 Tyr Leu Thr Leu Val Lys Lys Asp Val Glu Ser Ser Leu Ser Ser 1100 1105 1110 Cys Arg Asn Ser Ser Ser Thr Val Ser Tyr Leu Lys Lys Glu Asp 1115 1120 1125 Ser Val Ser Gln Ser Ser Ser Asp Ala Gly Leu Gly Ser Asp His 1130 1135 1140 Glu Ser Asp Thr Leu Thr Ile Asp Val Ser Ala Ile Ser Asn Leu 1145 1150 1155 Ile Arg Lys His Val Ser Glu Ala Arg Leu Val Glu Asp Ile Gly 1160 1165 1170 His Glu Leu Thr Tyr Val Leu Pro Tyr Glu Ala Ala Lys Glu Gly 1175 1180 1185 Ala Phe Val Glu Leu Phe His Glu Ile Asp Asp Arg Leu Ser Asp 1190 1195 1200 Leu Gly Ile Ser Ser Tyr Gly Ile Ser Glu Thr Thr Leu Glu Glu 1205 1210 1215 Ile Phe Leu Lys Val Ala Glu Glu Ser Gly Val Asp Ala Glu Thr 1220 1225 1230 Ser Asp Gly Thr Leu Pro Ala Arg Arg Asn Arg Arg Ala Phe Gly 1235 1240 1245 Asp Lys Gln Ser Cys Leu Arg Pro Phe Thr Glu Asp Asp Ala Ala 1250 1255 1260 Asp Pro Asn Asp Ser Asp Ile Asp Pro Glu Ser Arg Glu Thr Asp 1265 1270 1275 Leu Leu Ser Gly Met Asp Gly Lys Gly Ser Tyr Gln Val Lys Gly 1280 1285 1290 Trp Lys Leu Thr Gln Gln Gln Phe Val Ala Leu Leu Trp Lys Arg 1295 1300 1305 Leu Leu Ile Ala Arg Arg Ser Arg Lys Gly Phe Phe Ala Gln Ile 1310 1315 1320 Val Leu Pro Ala Val Phe Val Cys Ile Ala Leu Val Phe Ser Leu 1325 1330 1335 Ile Val Pro Pro Phe Gly Lys Tyr Pro Ser Leu Glu Leu Gln Pro 1340 1345 1350 Trp Met Tyr Asn Glu Gln Tyr Thr Phe Val Ser Asn Asp Ala Pro 1355 1360 1365 Glu Asp Thr Gly Thr Leu Glu Leu Leu Asn Ala Leu Thr Lys Asp 1370 1375 1380 Pro Gly Phe Gly Thr Arg Cys Met Glu Gly Asn Pro Ile Pro Asp 1385 1390 1395 Thr Pro Cys Gln Ala Gly Glu Glu Glu Trp Thr Thr Ala 
Pro Val 1400 1405 1410 Pro Gln Thr Ile Met Asp Leu Phe Gln Asn Gly Asn Trp Thr Met 1415 1420 1425 Gln Asn Pro Ser Pro Ala Cys Gln Cys Ser Ser Asp Lys Ile Lys 1430 1435 1440 Lys Met Leu Pro Val Cys Pro Pro Gly Ala Gly Gly Leu Pro Pro 1445 1450 1455 Pro Gln Arg Lys Gln Asn Thr Ala Asp Ile Leu Gln Asp Leu Thr 1460 1465 1470 Gly Arg Asn Ile Ser Asp Tyr Leu Val Lys Thr Tyr Val Gln Ile 1475 1480 1485 Ile Ala Lys Ser Leu Lys Asn Lys Ile Trp Val Asn Glu Phe Arg 1490 1495 1500 Tyr Gly Gly Phe Ser Leu Gly Val Ser Asn Thr Gln Ala Leu Pro 1505 1510 1515 Pro Ser Gln Glu Val Asn Asp Ala Thr Lys Gln Met Lys Lys His 1520 1525 1530 Leu Lys Leu Ala Lys Asp Ser Ser Ala Asp Arg Phe Leu Asn Ser 1535 1540 1545 Leu Gly Arg Phe Met Thr Gly Leu Asp Thr Arg Asn Asn Val Lys 1550 1555 1560 Val Trp Phe Asn Asn Lys Gly Trp His Ala Ile Ser Ser Phe Leu 1565 1570 1575 Asn Val Ile Asn Asn Ala Ile Leu Arg Ala Asn Leu Gln Lys Gly 1580 1585 1590 Glu Asn Pro Ser His Tyr Gly Ile Thr Ala Phe Asn His Pro Leu 1595 1600 1605 Asn Leu Thr Lys Gln Gln Leu Ser Glu Val Ala Pro Met Thr Thr 1610 1615 1620 Ser Val Asp Val Leu Val Ser Ile Cys Val Ile Phe Ala Met Ser 1625 1630 1635 Phe Val Pro Ala Ser Phe Val Val Phe Leu Ile Gln Glu Arg Val 1640 1645 1650 Ser Lys Ala Lys His Leu Gln Phe Ile Ser Gly Val Lys Pro Val 1655 1660 1665 Ile Tyr Trp Leu Ser Asn Phe Val Trp Asp Met Cys Asn Tyr Val 1670 1675 1680 Val Pro Ala Thr Leu Val Ile Ile Ile Phe Ile Cys Phe Gln Gln 1685 1690 1695 Lys Ser Tyr Val Ser Ser Thr Asn Leu Pro Val Leu Ala Leu Leu 1700 1705 1710 Leu Leu Leu Tyr Gly Trp Ser Ile Thr Pro Leu Met Tyr Pro Ala 1715 1720 1725 Ser Phe Val Phe Lys Ile Pro Ser Thr Ala Tyr Val Val Leu Thr 1730 1735 1740 Ser Val Asn Leu Phe Ile Gly Ile Asn Gly Ser Val Ala Thr Phe 1745 1750 1755 Val Leu Glu Leu Phe Thr Asp Asn Lys Leu Asn Asn Ile Asn Asp 1760 1765 1770 Ile Leu Lys Ser Val Phe Leu Ile Phe Pro His Phe Cys Leu Gly 1775 1780 1785 Arg Gly Leu Ile Asp Met Val Lys Asn Gln Ala Met Ala Asp Ala 1790 1795 1800 Leu Glu Arg Phe Gly Glu 
Asn Arg Phe Val Ser Pro Leu Ser Trp 1805 1810 1815 Asp Leu Val Gly Arg Asn Leu Phe Ala Met Ala Val Glu Gly Val 1820 1825 1830 Val Phe Phe Leu Ile Thr Val Leu Ile Gln Tyr Arg Phe Phe Ile 1835 1840 1845 Arg Pro Arg Pro Val Asn Ala Lys Leu Ser Pro Leu Asn Asp Glu 1850 1855 1860 Asp Glu Asp Val Arg Arg Glu Arg Gln Arg Ile Leu Asp Gly Gly 1865 1870 1875 Gly Gln Asn Asp Ile Leu Glu Ile Lys Glu Leu Thr Lys Ile Tyr 1880 1885 1890 Arg Arg Lys Arg Lys Pro Ala Val Asp Arg Ile Cys Val Gly Ile 1895 1900 1905 Pro Pro Gly Glu Cys Phe Gly Leu Leu Gly Val Asn Gly Ala Gly 1910 1915 1920 Lys Ser Ser Thr Phe Lys Met Leu Thr Gly Asp Thr Thr Val Thr 1925 1930 1935 Arg Gly Asp Ala Phe Leu Asn Arg Asn Ser Ile Leu Ser Asn Ile 1940 1945 1950 His Glu Val His Gln Asn Met Gly Tyr Cys Pro Gln Phe Asp Ala 1955 1960 1965 Ile Thr Glu Leu Leu Thr Gly Arg Glu His Val Glu Phe Phe Ala 1970 1975 1980 Leu Leu Arg Gly Val Pro Glu Lys Glu Val Gly Lys Val Gly Glu 1985 1990 1995 Trp Ala Ile Arg Lys Leu Gly Leu Val Lys Tyr Gly Glu Lys Tyr 2000 2005 2010 Ala Gly Asn Tyr Ser Gly Gly Asn Lys Arg Lys Leu Ser Thr Ala 2015 2020 2025 Met Ala Leu Ile Gly Gly Pro Pro Val Val Phe Leu Asp Glu Pro 2030 2035 2040 Thr Thr Gly Met Asp Pro Lys Ala Arg Arg Phe Leu Trp Asn Cys 2045 2050 2055 Ala Leu Ser Val Val Lys Glu Gly Arg Ser Val Val Leu Thr Ser 2060 2065 2070 His Ser Met Glu Glu Cys Glu Ala Leu Cys Thr Arg Met Ala Ile 2075 2080 2085 Met Val Asn Gly Arg Phe Arg Cys Leu Gly Ser Val Gln His Leu 2090 2095 2100 Lys Asn Arg Phe Gly Asp Gly Tyr Thr Ile Val Val Arg Ile Ala 2105 2110 2115 Gly Ser Asn Pro Asp Leu Lys Pro Val Gln Asp Phe Phe Gly Leu 2120 2125 2130 Ala Phe Pro Gly Ser Val Pro Lys Glu Lys His Arg Asn Met Leu 2135 2140 2145 Gln Tyr Gln Leu Pro Ser Ser Leu Ser Ser Leu Ala Arg Ile Phe 2150 2155 2160 Ser Ile Leu Ser Gln Ser Lys Lys Arg Leu His Ile Glu Asp Tyr 2165 2170 2175 Ser Val Ser Gln Thr Thr Leu Asp Gln Val Phe Val Asn Phe Ala 2180 2185 2190 Lys Asp Gln Ser Asp Asp Asp His Leu Lys Asp Leu Ser Leu His 2195 2200 
2205 Lys Asn Gln Thr Val Val Asp Val Ala Val Leu Thr Ser Phe Leu 2210 2215 2220 Gln Asp Glu Lys Val Lys Glu Ser Tyr Val 2225 2230 141 574 PRT Homo sapiens 141 Met Pro Ser Ala Gly Thr Leu Pro Trp Val Gln Gly Ile Ile Cys Asn 1 5 10 15 Ala Asn Asn Pro Cys Phe Arg Tyr Pro Thr Pro Gly Glu Ala Pro Gly 20 25 30 Val Val Gly Asn Phe Asn Lys Ser Ile Val Ala Arg Leu Phe Ser Asp 35 40 45 Ala Arg Arg Leu Leu Leu Tyr Ser Gln Lys Asp Thr Ser Met Lys Asp 50 55 60 Met Arg Lys Val Leu Arg Thr Leu Gln Gln Ile Lys Lys Ser Ser Ser 65 70 75 80 Asn Leu Lys Leu Gln Asp Phe Leu Val Asp Asn Glu Thr Phe Ser Gly 85 90 95 Phe Leu Tyr His Asn Leu Ser Leu Pro Lys Ser Thr Val Asp Lys Met 100 105 110 Leu Arg Ala Asp Val Ile Leu His Lys Val Phe Leu Gln Gly Tyr Gln 115 120 125 Leu His Leu Thr Ser Leu Cys Asn Gly Ser Lys Ser Glu Glu Met Ile 130 135 140 Gln Leu Gly Asp Gln Glu Val Ser Glu Leu Cys Gly Leu Pro Arg Glu 145 150 155 160 Lys Leu Ala Ala Ala Glu Arg Val Leu Arg Ser Asn Met Asp Ile Leu 165 170 175 Lys Pro Ile Leu Arg Thr Leu Asn Ser Thr Ser Pro Phe Pro Ser Lys 180 185 190 Glu Leu Ala Glu Ala Thr Lys Thr Leu Leu His Ser Leu Gly Thr Leu 195 200 205 Ala Gln Glu Leu Phe Ser Met Arg Ser Trp Ser Asp Met Arg Gln Glu 210 215 220 Val Met Phe Leu Thr Asn Val Asn Ser Ser Ser Ser Ser Thr Gln Ile 225 230 235 240 Tyr Gln Ala Val Ser Arg Ile Val Cys Gly His Pro Glu Gly Gly Gly 245 250 255 Leu Lys Ile Lys Ser Leu Asn Trp Tyr Glu Asp Asn Asn Tyr Lys Ala 260 265 270 Leu Phe Gly Gly Asn Gly Thr Glu Glu Asp Ala Glu Thr Phe Tyr Asp 275 280 285 Asn Ser Thr Thr Pro Tyr Cys Asn Asp Leu Met Lys Asn Leu Glu Ser 290 295 300 Ser Pro Leu Ser Arg Ile Ile Trp Lys Ala Leu Lys Pro Leu Leu Val 305 310 315 320 Gly Lys Ile Leu Tyr Thr Pro Asp Thr Pro Ala Thr Arg Gln Val Met 325 330 335 Ala Glu Val Asn Lys Thr Phe Gln Glu Leu Ala Val Phe His Asp Leu 340 345 350 Glu Gly Met Trp Glu Glu Leu Ser Pro Lys Ile Trp Thr Phe Met Glu 355 360 365 Asn Ser Gln Glu Met Asp Leu Val Arg Met Leu Leu Asp Ser Arg Asp 370 375 380 Asn Asp His Phe 
Trp Glu Gln Gln Leu Asp Gly Leu Asp Trp Thr Ala 385 390 395 400 Gln Asp Ile Val Ala Phe Leu Ala Lys His Pro Glu Asp Val Gln Ser 405 410 415 Ser Asn Gly Ser Val Tyr Thr Trp Arg Glu Ala Phe Asn Glu Thr Asn 420 425 430 Gln Ala Ile Arg Thr Ile Ser Arg Phe Met Glu Cys Val Asn Leu Asn 435 440 445 Lys Leu Glu Pro Ile Ala Thr Glu Val Trp Leu Ile Asn Lys Ser Met 450 455 460 Glu Leu Leu Asp Glu Arg Lys Phe Trp Ala Gly Ile Val Phe Thr Gly 465 470 475 480 Ile Thr Pro Gly Ser Ile Glu Leu Pro His His Val Lys Tyr Lys Ile 485 490 495 Arg Met Asp Ile Asp Asn Val Glu Arg Thr Asn Lys Ile Lys Asp Gly 500 505 510 Tyr Trp Asp Pro Gly Pro Arg Ala Asp Pro Phe Glu Asp Met Arg Tyr 515 520 525 Val Trp Gly Gly Phe Ala Tyr Leu Gln Asp Val Val Glu Gln Ala Ile 530 535 540 Ile Arg Val Leu Arg Ala Pro Arg Arg Lys Leu Val Ser Ile Cys Asn 545 550 555 560 Arg Cys Pro Ile Pro Val Thr Leu Met Thr Ser Phe Cys Gly 565 570 142 20 DNA Artificial Oligonucleotide Primer 142 aaaccagaca gtagtggacg 20 143 20 DNA Artificial Oligonucleotide Primer 143 gttactgcca ccagaacagc 20 144 20 DNA Artificial Oligonucleotide Primer 144 tgataagctg ttctggtggc 20 145 20 DNA Artificial Oligonucleotide Primer 145 cttggctttt gcattgttgc 20 146 20 DNA Artificial Oligonucleotide Primer 146 caatgcaaaa gccaagaaag 20 147 19 DNA Artificial Oligonucleotide Primer 147 tgcaacgatg ccatatcac 19 148 22 DNA Artificial Oligonucleotide Primer 148 caactcctta cttcggttcc tc 22 149 21 DNA Artificial Oligonucleotide Primer 149 gttttctgag gtgtcccaaa g 21 150 6 DNA Artificial Polyadenylation Sequence 150 attaaa 6 151 14 DNA Artificial DNA Oligonucleotide 151 tgagaggaag ttct 14 152 38 PRT Artificial Fragment of mutated ABC1 polypeptide 152 Glu Tyr Ser Gly Val Thr Ser Ala His Cys Asn Leu Cys Leu Leu Ser 1 5 10 15 Ser Ser Asp Ser Arg Ala Ser Ala Ser Gln Val Ala Gly Ile Thr Ala 20 25 30 Pro Ala Thr Thr Pro Gly 35 153 26 PRT Artificial Fragment of mutated ABC1 polypeptide 153 Arg Ala Pro Arg Arg Lys Leu Val Ser Ile Cys Asn Arg Cys Pro Ile 1 5 10 15 Pro Val 
Thr Leu Met Thr Ser Phe Cys Gly 20 25 154 6 PRT Artificial Fragment of ABC1 polypeptide 154 Asp Glu Arg Lys Phe Trp 1 5 155 20 DNA Artificial Oligonucleotide Primer 155 ctacccaccc tatgaacaac 20 156 20 DNA Artificial Oligonucleotide Primer 156 gccaccccgt atgaacaggg 20 157 47 DNA Artificial Multiple Cloning olignucleotide sequence 157 cggccgcggc gcgcccggac cgcctaggat ttaaatcgcg gcccgcg 47 158 69 DNA Artificial Multiple Cloning Oligonucleotide Sequence 158 gctctagaat tcggcctccg tggccgttta aacgctagcg cccgggctta attaagtcga 60 ctctagagc 69 159 27 PRT Artificial Synthetic Peptide derived from ABC1 polypeptide 159 Leu His Lys Asn Gln Thr Val Val Asp Val Ala Val Leu Thr Ser Phe 1 5 10 15 Leu Gln Asp Glu Lys Val Lys Glu Ser Tyr Val 20 25 

1. Nucleic acid comprising at least 245 consecutive nucleotides of a polynucleotide chosen from the group consisting of the nucleotide sequences SEQ ID NO 1-14, or a nucleic acid having a complementary sequence.
 2. Nucleic acid comprising a polynucleotide chosen from the group consisting of the nucleotide sequences SEQ ID NO 15-47, or a nucleic acid having a complementary sequence.
 3. Nucleic acid comprising at least 8 consecutive nucleotides of a polynucleotide chosen from the group consisting of the nucleotide sequences SEQ ID NO 48-90, or a nucleic acid having a complementary sequence.
 4. Nucleic acid having at least 80% nucleotide identity with a polynucleotide chosen from the group consisting of the nucleotide sequences SEQ ID NO 48-90, or a nucleic acid having a complementary sequence.
 5. Nucleic acid hybridizing, under high stringency conditions, with a polynucleotide chosen from the group consisting of the nucleotide sequences SEQ ID NO 48-90, or a nucleic acid having a complementary sequence.
 6. Nucleic acid comprising a polynucleotide having the sequence SEQ ID NO 91, or a nucleic acid having a complementary sequence.
 7. Nucleic acid comprising at least eight consecutive nucleotides of a polynucleotide chosen from the group consisting of the nucleotide sequences SEQ ID NO 92, a biologically active fragment thereof or a nucleic acid having a complementary sequence.
 8. Nucleic acid having at least 80% nucleotide identity with a polynucleotide chosen from the group consisting of the nucleotide sequences SEQ ID NO 92, a biologically active fragment thereof or a nucleic acid having a complementary sequence.
 9. Nucleic acid hybridizing, under high stringency conditions, with a polynucleotide chosen from the group consisting of the nucleotide sequences SEQ ID NO 92, a biologically active fragment thereof or a nucleic acid having a complementary sequence.
 10. Nucleic acid having at least eight consecutive nucleotides of a polynucleotide chosen from the group consisting of the nucleotide sequences SEQ ID NO 93-96, or a nucleic acid having a complementary sequence.
 11. Nucleic acid having at least 80% nucleotide identity with a polynucleotide chosen from the group consisting of the nucleotide sequences SEQ ID NO 93-96, or a nucleic acid having a complementary sequence.
 12. Nucleic acid hybridizing, under high stringency conditions, with a polynucleotide chosen from the group consisting of the nucleotide sequences SEQ ID NO 93-96, or a nucleic acid having a complementary sequence.
 13. Nucleic acid having at least eight consecutive nucleotides of a polynucleotide chosen from the group consisting of the nucleotide sequences SEQ ID NO 97-108 and comprising the polymorphic base, or a nucleic acid having a complementary sequence.
 14. Nucleotide probe or primer specific for the ABC1 gene, having a length of at least 15 nucleotides, chosen from the nucleic acids according to any one of claims 1 to 9.
 15. Probe or primer according to claim 14, comprising a polynucleotide chosen from the group consisting of the nucleotide sequences SEQ ID NO 109-138, or a nucleic acid having a complementary sequence.
 16. Nucleotide probe or primer useful for the detection of a mutation in the ABC1 gene, having a length of at least 15 nucleotides, chosen from the nucleic acids according to either of claims 11 and 12.
 17. Nucleotide probe or primer according to claim 16, comprising a polynucleotide chosen from the nucleotide sequences SEQ ID NO 109-112, or a nucleic acid having a complementary sequence.
 18. Nucleotide probe or primer useful for the detection of a polymorphism in the ABC1 gene, having a length of at least 15 nucleotides, chosen from the nucleic acids according to claim 13.
 19. Probe or primer according to claim 18, comprising a polynucleotide chosen from the nucleotide sequences SEQ ID NO 142-149, or a nucleic acid having a complementary sequence.
 20. Nucleotide primer comprising at least 15 consecutive nucleotides of a polynucleotide chosen from the group consisting of the nucleotide sequences SEQ ID NO 97-108 or of their complementary sequences, the base of the 3′ end of these primers being complementary to the nucleotide located immediately on the 5′ side of the polymorphic base of one of the sequences SEQ ID NO 97-108 or of their complementary sequences.
 21. Nucleotide primer comprising at least 15 consecutive nucleotides of a polynucleotide chosen from the group consisting of the nucleotide sequences SEQ ID NO 97-108 or of their complementary sequences, the base of the 3′ end of these primers being complementary to a nucleotide situated at 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20 nucleotides or more on the 5′ side of the polymorphic base of one of the sequences SEQ ID NO 97-108 or of their complementary sequences.
 22. Method of amplifying a nucleic acid according to any one of claims 1 to 13 contained in a sample, said method comprising the steps of: a) bringing the sample in which the presence of the target nucleic acid is suspected into contact with a pair of nucleotide primers whose hybridization position is located respectively on the 5′ side and on the 3′ side of the region of the target nucleic acid whose amplification is sought, in the presence of the reagents necessary for the amplification reaction; and b) detecting the amplified nucleic acids.
 23. Method of amplification according to claim 22, characterized in that the nucleotide primers are chosen from the primers according to any one of claims 14 to 19.
 24. Box for amplifying a nucleic acid according to any one of claims 1 to 13 comprising: a) a pair of nucleotide primers whose hybridization position is located respectively on the 5′ side and 3′ side of the target nucleic acid whose amplification is sought; b) where appropriate, the reagents necessary for the amplification reaction.
 25. Box for amplifying a nucleic acid according to claim 22, characterized in that the nucleotide primers are chosen from the group consisting of the primers according to any one of claims 14 to 19.
 26. Nucleotide probe according to any one of claims 14 to 19, characterized in that it comprises a marker compound whose presence is detectable.
 27. Method of detecting the presence of a nucleic acid according to any one of claims 1 to 13 in a sample, said method comprising the steps of: a) bringing one or more nucleic probes according to one of claims 14 to 19 into contact with the sample to be tested; b) detecting the complex which may have formed between the probe(s) and the nucleic acid present in the sample.
 28. Method of detection according to claim 27, characterized in that the probe(s) are immobilized on a support.
 29. Box for detecting the presence of a nucleic acid according to any one of claims 1 to 13 in a sample, said box comprising: a) one or more nucleotide probes according to any one of claims 14 to 19; b) where appropriate, the reagents necessary for the hybridization reaction.
 30. Box for detection according to claim 29, characterized in that the probe(s) are immobilized on a support.
 31. Recombinant vector comprising a nucleic acid according to one of claims 1 to 13.
 32. Vector according to claim 31, characterized in that it is an adenovirus.
 33. Vector according to either of claims 31 and 32, characterized in that it is ABC1-rldV.
 34. Recombinant host cell comprising a nucleic acid according to one of claims 1 to 13 or a recombinant vector according to one of claims 31 to 33.
 35. Mutated ABC1 polypeptide, characterized in that it comprises a polypeptide having the amino acid sequence SEQ ID NO 140.
 36. Mutated ABC1 polypeptide, characterized in that it comprises a polypeptide having the amino acid sequence SEQ ID NO 141.
 37. Antibody directed against a mutated ABC1 polypeptide according to either of claims 35 and 36, or a peptide fragment thereof.
 38. Antibody according to claim 37, characterized in that it comprises a detectable compound.
 39. Method of detecting the presence of a polypeptide according to either of claims 35 and 36 in a sample, comprising the steps of: a) bringing the sample into contact with an antibody according to either of claims 37 and 38; b) detecting the antigen/antibody complex formed.
 40. Diagnostic box for detecting the presence of a polypeptide according to either of claims 35 and 36 in a sample, said box comprising: a) an antibody according to either of claims 37 and 38; b) a reagent allowing the detection of the antigen/antibody complexes formed.
 41. Pharmaceutical composition intended for the prevention of, or the treatment of subjects affected by, a dysfunction in the reverse transport of cholesterol, comprising a nucleic acid according to either of claims 1 and 6, in combination with one or more physiologically compatible excipients.
 42. Pharmaceutical composition intended for the prevention of, or the treatment of subjects affected by, a dysfunction in the reverse transport of cholesterol, comprising a recombinant vector according to claim 31, in combination with one or more physiologically compatible excipients.
 43. Use of a nucleic acid according to one of claims 1 and 6 for the manufacture of a medicament intended for the prevention of atherosclerosis in its various forms or, more particularly, for the treatment of subjects affected by a dysfunction in the reverse transport of cholesterol.
 44. Use of a recombinant vector according to claim 31 for the manufacture of a medicament intended for the prevention of atherosclerosis in its various forms or, more particularly, for the treatment of subjects affected by a dysfunction in the reverse transport of cholesterol.
 45. Use according to claim 44, characterized in that the vector is ABC1-rldV.
 46. Use of the ABC1 polypeptide having the sequence SEQ ID NO 139 for the manufacture of a medicament intended for the prevention of atherosclerosis in its various forms or, more particularly, for the treatment of subjects affected by a dysfunction in the reverse transport of cholesterol.
 47. Pharmaceutical composition for the prevention of, or the treatment of subjects affected by, a dysfunction in the reverse transport of cholesterol, comprising a therapeutically effective quantity of the polypeptide having the sequence SEQ ID NO 139.
 48. Use of the ABC1 polypeptide, or of cells expressing the ABC1 polypeptide, for screening active ingredients for the prevention or treatment of diseases resulting from a dysfunction in the reverse transport of cholesterol.
 49. Method of screening a compound active on the metabolism of cholesterol, an agonist or antagonist of the ABC1 polypeptide, said method comprising the following steps: a) preparing membrane vesicles containing the ABC1 polypeptide and a lipid substrate comprising a detectable marker; b) incubating the vesicles obtained in step a) with an agonist or antagonist candidate compound; c) qualitatively and/or quantitatively measuring the release of the lipid substrate comprising a detectable marker; d) comparing the measurement obtained in step c) with a measurement of the release of the labeled lipid substrate by vesicles which have not been previously incubated with the agonist or antagonist candidate compound.
 50. Method of screening a compound active on the metabolism of cholesterol, an agonist or antagonist of the ABC1 polypeptide, said method comprising the following steps: a) obtaining cells, for example a cell line, expressing naturally or after transfection the ABC1 polypeptide; b) incubating the cells of step a) in the presence of an anion labeled with a detectable marker; c) washing the cells of step b) in order to remove the excess of the labeled anion which has not penetrated into these cells; d) incubating the cells obtained in step c) with an agonist or antagonist candidate compound for the ABC1 polypeptide; e) measuring the efflux of the labeled anion; f) comparing the value of the efflux of the labeled anion determined in step e) with the value of the efflux of the labeled anion measured with cells which have not been previously incubated in the presence of the agonist or antagonist candidate compound for the ABC1 polypeptide.
 51. Method of screening a compound active on the metabolism of cholesterol, an agonist or antagonist of the ABC1 polypeptide, said method comprising the following steps: a) culturing cells of a human monocytic line in an appropriate culture medium, in the presence of purified human albumin; b) incubating the cells of step a) simultaneously in the presence of a compound stimulating the production of IL-1 beta and of the agonist or antagonist candidate compound; c) incubating the cells obtained in step b) in the presence of an appropriate concentration of ATP; d) measuring IL-1 beta released into the cell culture supernatant; e) comparing the value of the release of the IL-1 beta obtained in step d) with the value of the IL-1 beta released into the culture supernatant of cells which have not been previously incubated in the presence of the agonist or antagonist candidate compound.
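
Illustrative note (not part of the claims). Several of the claims above recite "a nucleic acid having a complementary sequence" and probes or primers "having a length of at least 15 nucleotides". The short Python sketch below shows one way such a complementary sequence and length check might be computed for the primer listed as SEQ ID NO 142. The function names `reverse_complement` and `meets_min_length` are hypothetical helpers introduced only for this illustration, and the sketch assumes a plain A/C/G/T alphabet with the complementary strand written 5′ to 3′; it is not the procedure of the specification.

```python
# Illustrative sketch only: helper names are hypothetical, not from the specification.
# Assumes an unambiguous A/C/G/T alphabet (no IUPAC ambiguity codes).
COMPLEMENT = str.maketrans("acgtACGT", "tgcaTGCA")


def reverse_complement(seq: str) -> str:
    """Return the complementary strand, read 5' to 3', ignoring whitespace."""
    cleaned = "".join(seq.split())          # drop the spaces used in the listing
    return cleaned.translate(COMPLEMENT)[::-1]


def meets_min_length(seq: str, minimum: int = 15) -> bool:
    """Check the 'at least 15 nucleotides' length condition recited for probes and primers."""
    return len("".join(seq.split())) >= minimum


# Primer as printed in the sequence listing (SEQ ID NO 142).
SEQ_ID_NO_142 = "aaaccagaca gtagtggacg"

if __name__ == "__main__":
    print(reverse_complement(SEQ_ID_NO_142))
    print(meets_min_length(SEQ_ID_NO_142))
```

Applying `reverse_complement` twice returns the original sequence with the listing's spacing removed, which is a convenient sanity check when transcribing primers from the sequence listing.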